Overview

Dataset statistics

Number of variables: 60
Number of observations: 584201
Missing cells: 12827277
Missing cells (%): 36.6%
Total size in memory: 267.4 MiB
Average record size in memory: 480.0 B
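
The derived figures in this table follow from the raw counts. The short check below is only a sketch: the variable names are illustrative, and it assumes that 1 MiB means 1024 * 1024 bytes and that the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 60
n_observations = 584_201
missing_cells = 12_827_277
total_size_mib = 267.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both derived values agree with the table.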

Variable types

Text: 60

Dataset

Description: Herpetology NMNH Extant Specimen Records 0054921-241126133413365
URL: https://doi.org/10.15468/dl.rf2che

Alerts

institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "HERP" Constant
datasetName has constant value "NMNH Extant Biology" Constant
kingdom has constant value "Animalia" Constant
phylum has constant value "Chordata" Constant
taxonRank has constant value "subspecies" Constant
recordNumber has 583925 (> 99.9%) missing values Missing
sex has 527948 (90.4%) missing values Missing
lifeStage has 539845 (92.4%) missing values Missing
associatedMedia has 579054 (99.1%) missing values Missing
associatedSequences has 583480 (99.9%) missing values Missing
occurrenceRemarks has 557618 (95.4%) missing values Missing
fieldNumber has 584193 (> 99.9%) missing values Missing
eventDate has 37781 (6.5%) missing values Missing
startDayOfYear has 55728 (9.5%) missing values Missing
endDayOfYear has 55637 (9.5%) missing values Missing
year has 37781 (6.5%) missing values Missing
month has 54300 (9.3%) missing values Missing
day has 85891 (14.7%) missing values Missing
waterBody has 555994 (95.2%) missing values Missing
islandGroup has 564324 (96.6%) missing values Missing
island has 576136 (98.6%) missing values Missing
stateProvince has 17001 (2.9%) missing values Missing
county has 191557 (32.8%) missing values Missing
minimumElevationInMeters has 332173 (56.9%) missing values Missing
maximumElevationInMeters has 333225 (57.0%) missing values Missing
verbatimElevation has 331608 (56.8%) missing values Missing
decimalLatitude has 162901 (27.9%) missing values Missing
decimalLongitude has 162901 (27.9%) missing values Missing
geodeticDatum has 438700 (75.1%) missing values Missing
coordinateUncertaintyInMeters has 439218 (75.2%) missing values Missing
verbatimLatitude has 334540 (57.3%) missing values Missing
verbatimLongitude has 334562 (57.3%) missing values Missing
georeferenceProtocol has 439136 (75.2%) missing values Missing
georeferenceRemarks has 443625 (75.9%) missing values Missing
identificationQualifier has 583784 (99.9%) missing values Missing
typeStatus has 570681 (97.7%) missing values Missing
identifiedBy has 584125 (> 99.9%) missing values Missing
specificEpithet has 13122 (2.2%) missing values Missing
infraspecificEpithet has 556206 (95.2%) missing values Missing
taxonRank has 556206 (95.2%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
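
The three alert labels in this list (Constant, Missing, Unique) can be derived from simple per-column counts. The sketch below is a hypothetical re-implementation for illustration, not ydata-profiling's own logic. The helper name `column_alerts`, the use of `None` for a missing cell, and the rule that any missing cell triggers the Missing label are all assumptions. It judges "constant" on the non-missing values only, which is consistent with taxonRank appearing under both Constant and Missing.

```python
def column_alerts(values):
    """Return alert labels for one column (hypothetical helper).

    `values` is a list of cell values; None marks a missing cell.
    """
    present = [v for v in values if v is not None]
    distinct = set(present)
    alerts = []
    # Constant: only one distinct value among the non-missing cells.
    if len(distinct) == 1:
        alerts.append("Constant")
    # Missing: assumed here to fire on any missing cell.
    if len(present) < len(values):
        alerts.append("Missing")
    # Unique: every row holds a different, non-missing value.
    if len(values) > 1 and len(distinct) == len(values):
        alerts.append("Unique")
    return alerts


print(column_alerts(["USNM", "USNM", "USNM"]))        # shaped like institutionCode
print(column_alerts(["subspecies", None, None]))      # shaped like taxonRank
print(column_alerts(["USNM 231889", "USNM 487703"]))  # shaped like catalogNumber
```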

Reproduction

Analysis started: 2025-01-14 16:50:03.383103
Analysis finished: 2025-01-14 16:50:17.330904
Duration: 13.95 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
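
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-14 16:50:03.383103")
finished = datetime.fromisoformat("2025-01-14 16:50:17.330904")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```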

Variables

gbifID
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5842010
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: 1317203362
2nd row: 1317203927
3rd row: 1317204107
4th row: 1322537851
5th row: 1322539748

Value | Count | Frequency (%)
1317203362 | 1 | < 0.1%
1322539748 | 1 | < 0.1%
1322560470 | 1 | < 0.1%
1322558547 | 1 | < 0.1%
1317274722 | 1 | < 0.1%
1317214758 | 1 | < 0.1%
1317204107 | 1 | < 0.1%
1322537851 | 1 | < 0.1%
1317211425 | 1 | < 0.1%
1322569185 | 1 | < 0.1%
Other values (584191) | 584191 | > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5842010
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5842010
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5842010
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%
Distinct: 11116
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11099819
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
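
As an illustration of those character properties, the sketch below counts Unicode general categories for one timestamp value taken from this variable's sample rows. It uses Python's standard `unicodedata` module, which exposes the general category but not the script or block, so only the category breakdown is reproduced. The mapping from two-letter category codes to the names used in this report, and the helper name `category_counts`, are assumptions made for readability.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general categories mapped to the names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text):
    """Count characters in `text` by Unicode general category (hypothetical helper)."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}


sample = "2022-03-25 16:29:00"  # 1st sample row of this variable
counts = category_counts(sample)
for name, n in counts.items():
    print(f"{name}: {n}")
```

Every value of this variable has the same 19-character shape, so multiplying these per-value counts by the 584201 observations gives the category totals listed under "Most occurring categories".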

Unique

Unique: 6239
Unique (%): 1.1%

Sample

1st row: 2022-03-25 16:29:00
2nd row: 2022-12-14 12:20:00
3rd row: 2022-07-25 13:54:00
4th row: 2022-03-25 16:12:00
5th row: 2022-03-25 16:41:00

Value | Count | Frequency (%)
2022-08-17 | 164186 | 14.1%
2022-03-25 | 159648 | 13.7%
2018-10-02 | 114364 | 9.8%
2018-10-01 | 11147 | 1.0%
2022-09-02 | 9885 | 0.8%
2024-09-04 | 7897 | 0.7%
2022-12-02 | 7119 | 0.6%
2014-08-26 | 6094 | 0.5%
2020-09-23 | 6045 | 0.5%
2014-08-28 | 5728 | 0.5%
Other values (2040) | 676289 | 57.9%

Most occurring characters

ValueCountFrequency (%)
0 2825947
25.5%
2 1945269
17.5%
1 1362306
12.3%
- 1168402
10.5%
: 1168402
10.5%
584201
 
5.3%
8 454189
 
4.1%
5 397958
 
3.6%
3 369937
 
3.3%
7 249238
 
2.2%
Other values (3) 573970
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8178814
73.7%
Dash Punctuation 1168402
 
10.5%
Other Punctuation 1168402
 
10.5%
Space Separator 584201
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2825947
34.6%
2 1945269
23.8%
1 1362306
16.7%
8 454189
 
5.6%
5 397958
 
4.9%
3 369937
 
4.5%
7 249238
 
3.0%
4 229171
 
2.8%
6 174007
 
2.1%
9 170792
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 1168402
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1168402
100.0%
Space Separator
ValueCountFrequency (%)
584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11099819
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2825947
25.5%
2 1945269
17.5%
1 1362306
12.3%
- 1168402
10.5%
: 1168402
10.5%
584201
 
5.3%
8 454189
 
4.1%
5 397958
 
3.6%
3 369937
 
3.3%
7 249238
 
2.2%
Other values (3) 573970
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11099819
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2825947
25.5%
2 1945269
17.5%
1 1362306
12.3%
- 1168402
10.5%
: 1168402
10.5%
584201
 
5.3%
8 454189
 
4.1%
5 397958
 
3.6%
3 369937
 
3.3%
7 249238
 
2.2%
Other values (3) 573970
 
5.2%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 16941829
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value | Count | Frequency (%)
urn:lsid:biocol.org:col:34871 | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
o 2336804
13.8%
: 2336804
13.8%
l 1752603
 
10.3%
i 1168402
 
6.9%
r 1168402
 
6.9%
c 1168402
 
6.9%
g 584201
 
3.4%
7 584201
 
3.4%
8 584201
 
3.4%
4 584201
 
3.4%
Other values (8) 4673608
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11099819
65.5%
Other Punctuation 2921005
 
17.2%
Decimal Number 2921005
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2336804
21.1%
l 1752603
15.8%
i 1168402
10.5%
r 1168402
10.5%
c 1168402
10.5%
g 584201
 
5.3%
u 584201
 
5.3%
b 584201
 
5.3%
d 584201
 
5.3%
s 584201
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 584201
20.0%
8 584201
20.0%
4 584201
20.0%
3 584201
20.0%
1 584201
20.0%
Other Punctuation
ValueCountFrequency (%)
: 2336804
80.0%
. 584201
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11099819
65.5%
Common 5842010
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2336804
21.1%
l 1752603
15.8%
i 1168402
10.5%
r 1168402
10.5%
c 1168402
10.5%
g 584201
 
5.3%
u 584201
 
5.3%
b 584201
 
5.3%
d 584201
 
5.3%
s 584201
 
5.3%
Common
ValueCountFrequency (%)
: 2336804
40.0%
7 584201
 
10.0%
8 584201
 
10.0%
4 584201
 
10.0%
3 584201
 
10.0%
. 584201
 
10.0%
1 584201
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16941829
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2336804
13.8%
: 2336804
13.8%
l 1752603
 
10.3%
i 1168402
 
6.9%
r 1168402
 
6.9%
c 1168402
 
6.9%
g 584201
 
3.4%
7 584201
 
3.4%
8 584201
 
3.4%
4 584201
 
3.4%
Other values (8) 4673608
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 26289045
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
2nd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
3rd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
4th row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
5th row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0

Value | Count | Frequency (%)
urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0 | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
1 2921005
 
11.1%
- 2336804
 
8.9%
u 1752603
 
6.7%
c 1752603
 
6.7%
7 1752603
 
6.7%
0 1752603
 
6.7%
b 1752603
 
6.7%
d 1752603
 
6.7%
4 1168402
 
4.4%
f 1168402
 
4.4%
Other values (10) 8178814
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11684020
44.4%
Decimal Number 11099819
42.2%
Dash Punctuation 2336804
 
8.9%
Other Punctuation 1168402
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1752603
15.0%
c 1752603
15.0%
b 1752603
15.0%
d 1752603
15.0%
f 1168402
10.0%
a 1168402
10.0%
i 584201
 
5.0%
r 584201
 
5.0%
e 584201
 
5.0%
n 584201
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 2921005
26.3%
7 1752603
15.8%
0 1752603
15.8%
4 1168402
 
10.5%
8 1168402
 
10.5%
3 1168402
 
10.5%
9 584201
 
5.3%
6 584201
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 2336804
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14605025
55.6%
Latin 11684020
44.4%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2921005
20.0%
- 2336804
16.0%
7 1752603
12.0%
0 1752603
12.0%
4 1168402
 
8.0%
: 1168402
 
8.0%
8 1168402
 
8.0%
3 1168402
 
8.0%
9 584201
 
4.0%
6 584201
 
4.0%
Latin
ValueCountFrequency (%)
u 1752603
15.0%
c 1752603
15.0%
b 1752603
15.0%
d 1752603
15.0%
f 1168402
10.0%
a 1168402
10.0%
i 584201
 
5.0%
r 584201
 
5.0%
e 584201
 
5.0%
n 584201
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26289045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2921005
 
11.1%
- 2336804
 
8.9%
u 1752603
 
6.7%
c 1752603
 
6.7%
7 1752603
 
6.7%
0 1752603
 
6.7%
b 1752603
 
6.7%
d 1752603
 
6.7%
4 1168402
 
4.4%
f 1168402
 
4.4%
Other values (10) 8178814
31.1%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2336804
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value | Count | Frequency (%)
usnm | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2336804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2336804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2336804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2336804
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HERP
2nd row: HERP
3rd row: HERP
4th row: HERP
5th row: HERP

Value | Count | Frequency (%)
herp | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2336804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2336804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2336804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11099819
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value | Count | Frequency (%)
nmnh | 584201 | 33.3%
extant | 584201 | 33.3%
biology | 584201 | 33.3%
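
The frequency table for this variable lists lower-cased words rather than whole values: the single constant value "NMNH Extant Biology" shows up as three entries of 33.3% each. The sketch below reproduces that pattern on the assumption that values are lower-cased and split on whitespace before counting. The helper name `word_frequencies` is hypothetical, and the profiler's actual tokenisation may differ.

```python
from collections import Counter


def word_frequencies(values):
    """Count lower-cased, whitespace-separated words across all values.

    Returns {word: (count, percentage of all words)}. Hypothetical helper.
    """
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    total = sum(counts.values())
    return {word: (n, 100 * n / total) for word, n in counts.items()}


# Three rows stand in for the 584201 identical values in the dataset.
freqs = word_frequencies(["NMNH Extant Biology"] * 3)
for word, (n, pct) in freqs.items():
    print(f"{word} | {n} | {pct:.1f}%")
```

The same assumption would explain why a value such as "USNM 231889" in catalogNumber contributes the word "usnm" to that variable's table.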

Most occurring characters

ValueCountFrequency (%)
N 1168402
 
10.5%
1168402
 
10.5%
t 1168402
 
10.5%
o 1168402
 
10.5%
M 584201
 
5.3%
H 584201
 
5.3%
E 584201
 
5.3%
x 584201
 
5.3%
a 584201
 
5.3%
n 584201
 
5.3%
Other values (5) 2921005
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6426211
57.9%
Uppercase Letter 3505206
31.6%
Space Separator 1168402
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1168402
18.2%
o 1168402
18.2%
x 584201
9.1%
a 584201
9.1%
n 584201
9.1%
i 584201
9.1%
l 584201
9.1%
g 584201
9.1%
y 584201
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1168402
33.3%
M 584201
16.7%
H 584201
16.7%
E 584201
16.7%
B 584201
16.7%
Space Separator
ValueCountFrequency (%)
1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9931417
89.5%
Common 1168402
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1168402
11.8%
t 1168402
11.8%
o 1168402
11.8%
M 584201
 
5.9%
H 584201
 
5.9%
E 584201
 
5.9%
x 584201
 
5.9%
a 584201
 
5.9%
n 584201
 
5.9%
B 584201
 
5.9%
Other values (4) 2336804
23.5%
Common
ValueCountFrequency (%)
1168402
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11099819
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1168402
 
10.5%
1168402
 
10.5%
t 1168402
 
10.5%
o 1168402
 
10.5%
M 584201
 
5.3%
H 584201
 
5.3%
E 584201
 
5.3%
x 584201
 
5.3%
a 584201
 
5.3%
n 584201
 
5.3%
Other values (5) 2921005
26.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.00021739
Min length: 17

Characters and Unicode

Total characters: 9931544
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value | Count | Frequency (%)
preservedspecimen | 584074 | > 99.9%
machineobservation | 127 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 2920624
29.4%
r 1168275
11.8%
i 584328
 
5.9%
n 584328
 
5.9%
c 584201
 
5.9%
s 584201
 
5.9%
v 584201
 
5.9%
m 584074
 
5.9%
P 584074
 
5.9%
p 584074
 
5.9%
Other values (9) 1169164
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8763142
88.2%
Uppercase Letter 1168402
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2920624
33.3%
r 1168275
13.3%
i 584328
 
6.7%
n 584328
 
6.7%
c 584201
 
6.7%
s 584201
 
6.7%
v 584201
 
6.7%
m 584074
 
6.7%
p 584074
 
6.7%
d 584074
 
6.7%
Other values (5) 762
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
P 584074
50.0%
S 584074
50.0%
M 127
 
< 0.1%
O 127
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 9931544
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2920624
29.4%
r 1168275
11.8%
i 584328
 
5.9%
n 584328
 
5.9%
c 584201
 
5.9%
s 584201
 
5.9%
v 584201
 
5.9%
m 584074
 
5.9%
P 584074
 
5.9%
p 584074
 
5.9%
Other values (9) 1169164
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9931544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2920624
29.4%
r 1168275
11.8%
i 584328
 
5.9%
n 584328
 
5.9%
c 584201
 
5.9%
s 584201
 
5.9%
v 584201
 
5.9%
m 584074
 
5.9%
P 584074
 
5.9%
p 584074
 
5.9%
Other values (9) 1169164
11.8%

occurrenceID
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 36804663
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3000ac9b1-ec0b-4be2-939f-464ad355cc84
2nd row: http://n2t.net/ark:/65665/30010adfb-58e1-4e98-8d39-ee055b3463fa
3rd row: http://n2t.net/ark:/65665/30012ab17-d2a1-470c-a774-540bc6cffb00
4th row: http://n2t.net/ark:/65665/3ec02d332-deb7-4b55-ba3d-5a5d6ca577c9
5th row: http://n2t.net/ark:/65665/3ec19a125-2484-4fa3-b6b7-7d87199a6994

Value | Count | Frequency (%)
http://n2t.net/ark:/65665/3000ac9b1-ec0b-4be2-939f-464ad355cc84 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec19a125-2484-4fa3-b6b7-7d87199a6994 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ed02751f-656c-458c-80fa-90bf891a2063 | 1 | < 0.1%
http://n2t.net/ark:/65665/3eced04ac-39a4-455a-85e7-7cb0b4299f6b | 1 | < 0.1%
http://n2t.net/ark:/65665/303348f04-82b4-456c-be8d-764af3205229 | 1 | < 0.1%
http://n2t.net/ark:/65665/3008b1b21-05b1-4e8d-b34c-1e3a96daecf7 | 1 | < 0.1%
http://n2t.net/ark:/65665/30012ab17-d2a1-470c-a774-540bc6cffb00 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec02d332-deb7-4b55-ba3d-5a5d6ca577c9 | 1 | < 0.1%
http://n2t.net/ark:/65665/3006575b6-ca0a-42bd-b75d-3241cc3e332d | 1 | < 0.1%
http://n2t.net/ark:/65665/3ed66e63b-4fff-4639-8abf-a635d31dd047 | 1 | < 0.1%
Other values (584191) | 584191 | > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 2921005
 
7.9%
6 2847614
 
7.7%
- 2336804
 
6.3%
t 2336804
 
6.3%
5 2265995
 
6.2%
a 1826256
 
5.0%
e 1681096
 
4.6%
2 1680524
 
4.6%
3 1680017
 
4.6%
4 1678083
 
4.6%
Other values (16) 15550465
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15919515
43.3%
Lowercase Letter 13874736
37.7%
Other Punctuation 4673608
 
12.7%
Dash Punctuation 2336804
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2336804
16.8%
a 1826256
13.2%
e 1681096
12.1%
b 1241364
8.9%
n 1168402
8.4%
c 1094913
7.9%
f 1094889
7.9%
d 1094208
7.9%
k 584201
 
4.2%
r 584201
 
4.2%
Other values (2) 1168402
8.4%
Decimal Number
ValueCountFrequency (%)
6 2847614
17.9%
5 2265995
14.2%
2 1680524
10.6%
3 1680017
10.6%
4 1678083
10.5%
9 1244007
7.8%
8 1240305
7.8%
1 1096638
 
6.9%
7 1094431
 
6.9%
0 1091901
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 2921005
62.5%
: 1168402
 
25.0%
. 584201
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2336804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22929927
62.3%
Latin 13874736
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2921005
12.7%
6 2847614
12.4%
- 2336804
10.2%
5 2265995
9.9%
2 1680524
7.3%
3 1680017
7.3%
4 1678083
7.3%
9 1244007
 
5.4%
8 1240305
 
5.4%
: 1168402
 
5.1%
Other values (4) 3867171
16.9%
Latin
ValueCountFrequency (%)
t 2336804
16.8%
a 1826256
13.2%
e 1681096
12.1%
b 1241364
8.9%
n 1168402
8.4%
c 1094913
7.9%
f 1094889
7.9%
d 1094208
7.9%
k 584201
 
4.2%
r 584201
 
4.2%
Other values (2) 1168402
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36804663
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2921005
 
7.9%
6 2847614
 
7.7%
- 2336804
 
6.3%
t 2336804
 
6.3%
5 2265995
 
6.2%
a 1826256
 
5.0%
e 1681096
 
4.6%
2 1680524
 
4.6%
3 1680017
 
4.6%
4 1678083
 
4.6%
Other values (16) 15550465
42.3%

catalogNumber
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 21
Median length: 11
Mean length: 10.93256944
Min length: 6

Characters and Unicode

Total characters: 6386818
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: USNM 231889
2nd row: USNM 487703
3rd row: USNM 297347
4th row: USNM 322261
5th row: USNM 319170

Value | Count | Frequency (%)
usnm | 584201 | 49.5%
herp | 5833 | 0.5%
tissue | 5706 | 0.5%
image | 127 | < 0.1%
2847 | 3 | < 0.1%
2877 | 3 | < 0.1%
2872 | 3 | < 0.1%
2940 | 3 | < 0.1%
2715 | 3 | < 0.1%
9 | 3 | < 0.1%
Other values (581072) | 584183 | 49.5%

Most occurring characters

ValueCountFrequency (%)
595867
 
9.3%
U 584201
 
9.1%
N 584201
 
9.1%
M 584201
 
9.1%
S 584201
 
9.1%
4 393545
 
6.2%
2 393142
 
6.2%
3 392798
 
6.2%
1 391284
 
6.1%
5 383581
 
6.0%
Other values (17) 1499797
23.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3395944
53.2%
Uppercase Letter 2348470
36.8%
Space Separator 595867
 
9.3%
Lowercase Letter 46537
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 393545
11.6%
2 393142
11.6%
3 392798
11.6%
1 391284
11.5%
5 383581
11.3%
6 292686
8.6%
7 291064
8.6%
8 290326
8.5%
9 285200
8.4%
0 282318
8.3%
Lowercase Letter
ValueCountFrequency (%)
e 11666
25.1%
s 11412
24.5%
r 5833
12.5%
p 5833
12.5%
i 5706
12.3%
u 5706
12.3%
m 127
 
0.3%
a 127
 
0.3%
g 127
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
U 584201
24.9%
N 584201
24.9%
M 584201
24.9%
S 584201
24.9%
H 5833
 
0.2%
T 5706
 
0.2%
I 127
 
< 0.1%
Space Separator
ValueCountFrequency (%)
595867
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3991811
62.5%
Latin 2395007
37.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584201
24.4%
N 584201
24.4%
M 584201
24.4%
S 584201
24.4%
e 11666
 
0.5%
s 11412
 
0.5%
H 5833
 
0.2%
r 5833
 
0.2%
p 5833
 
0.2%
T 5706
 
0.2%
Other values (6) 11920
 
0.5%
Common
ValueCountFrequency (%)
595867
14.9%
4 393545
9.9%
2 393142
9.8%
3 392798
9.8%
1 391284
9.8%
5 383581
9.6%
6 292686
7.3%
7 291064
7.3%
8 290326
7.3%
9 285200
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6386818
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
595867
 
9.3%
U 584201
 
9.1%
N 584201
 
9.1%
M 584201
 
9.1%
S 584201
 
9.1%
4 393545
 
6.2%
2 393142
 
6.2%
3 392798
 
6.2%
1 391284
 
6.1%
5 383581
 
6.0%
Other values (17) 1499797
23.5%

recordNumber
Text

Missing 

Distinct: 273
Distinct (%): 98.9%
Missing: 583925
Missing (%): > 99.9%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 9
Mean length: 8.460144928
Min length: 1

Characters and Unicode

Total characters: 2335
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 271
Unique (%): 98.2%
Sample

1st row: RWM 20004
2nd row: RWM 19953
3rd row: RWM 19978
4th row: RWM 19932
5th row: RWM 19955

Value | Count | Frequency (%)
rwm | 182 | 33.2%
gmu | 74 | 13.5%
lc | 15 | 2.7%
8 | 3 | 0.5%
19897 | 2 | 0.4%
19895 | 1 | 0.2%
19926 | 1 | 0.2%
2430 | 1 | 0.2%
19973 | 1 | 0.2%
19925 | 1 | 0.2%
Other values (267) | 267 | 48.7%

Most occurring characters

ValueCountFrequency (%)
272
11.6%
9 260
11.1%
M 257
11.0%
0 245
10.5%
1 190
8.1%
W 182
7.8%
R 182
7.8%
2 165
7.1%
3 95
 
4.1%
G 75
 
3.2%
Other values (9) 412
17.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1262
54.0%
Uppercase Letter 801
34.3%
Space Separator 272
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 260
20.6%
0 245
19.4%
1 190
15.1%
2 165
13.1%
3 95
 
7.5%
7 71
 
5.6%
6 63
 
5.0%
4 62
 
4.9%
8 57
 
4.5%
5 54
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
M 257
32.1%
W 182
22.7%
R 182
22.7%
G 75
 
9.4%
U 74
 
9.2%
C 15
 
1.9%
L 15
 
1.9%
D 1
 
0.1%
Space Separator
ValueCountFrequency (%)
272
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1534
65.7%
Latin 801
34.3%

Most frequent character per script

Common
ValueCountFrequency (%)
272
17.7%
9 260
16.9%
0 245
16.0%
1 190
12.4%
2 165
10.8%
3 95
 
6.2%
7 71
 
4.6%
6 63
 
4.1%
4 62
 
4.0%
8 57
 
3.7%
Latin
ValueCountFrequency (%)
M 257
32.1%
W 182
22.7%
R 182
22.7%
G 75
 
9.4%
U 74
 
9.2%
C 15
 
1.9%
L 15
 
1.9%
D 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2335
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
272
11.6%
9 260
11.1%
M 257
11.0%
0 245
10.5%
1 190
8.1%
W 182
7.8%
R 182
7.8%
2 165
7.1%
3 95
 
4.1%
G 75
 
3.2%
Other values (9) 412
17.6%
Distinct: 158
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 1
Mean length: 1.004863086
Min length: 1

Characters and Unicode

Total characters: 587038
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value | Count | Frequency (%)
1 | 576101 | 98.6%
2 | 1312 | 0.2%
0 | 1007 | 0.2%
3 | 830 | 0.1%
5 | 523 | 0.1%
4 | 522 | 0.1%
6 | 386 | 0.1%
7 | 339 | 0.1%
8 | 271 | < 0.1%
10 | 257 | < 0.1%
Other values (148) | 2649 | 0.5%

Most occurring characters

ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 587038
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 587038
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 587038
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

sex
Text

Missing 

Distinct: 8
Distinct (%): < 0.1%
Missing: 527948
Missing (%): 90.4%
Memory size: 4.5 MiB

Length

Max length: 13
Median length: 4
Mean length: 5.299326258
Min length: 4

Characters and Unicode

Total characters: 298103
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Male
2nd row: Male
3rd row: Female
4th row: Male
5th row: Female
ValueCountFrequency (%)
male 29804
49.4%
female 22454
37.2%
sex 3994
 
6.6%
unknown 3994
 
6.6%
108
 
0.2%
hermaphrodite 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 78708
26.4%
a 52259
17.5%
l 52258
17.5%
M 29759
 
10.0%
m 22500
 
7.5%
F 22437
 
7.5%
n 11982
 
4.0%
4102
 
1.4%
o 3995
 
1.3%
w 3994
 
1.3%
Other values (12) 16109
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 237703
79.7%
Uppercase Letter 56190
 
18.8%
Space Separator 4102
 
1.4%
Other Punctuation 108
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 78708
33.1%
a 52259
22.0%
l 52258
22.0%
m 22500
 
9.5%
n 11982
 
5.0%
o 3995
 
1.7%
w 3994
 
1.7%
k 3994
 
1.7%
u 3994
 
1.7%
x 3994
 
1.7%
Other values (7) 25
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
M 29759
53.0%
F 22437
39.9%
S 3994
 
7.1%
Space Separator
ValueCountFrequency (%)
4102
100.0%
Other Punctuation
ValueCountFrequency (%)
? 108
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 293893
98.6%
Common 4210
 
1.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 78708
26.8%
a 52259
17.8%
l 52258
17.8%
M 29759
 
10.1%
m 22500
 
7.7%
F 22437
 
7.6%
n 11982
 
4.1%
o 3995
 
1.4%
w 3994
 
1.4%
k 3994
 
1.4%
Other values (10) 12007
 
4.1%
Common
ValueCountFrequency (%)
4102
97.4%
? 108
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 298103
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 78708
26.4%
a 52259
17.5%
l 52258
17.5%
M 29759
 
10.0%
m 22500
 
7.5%
F 22437
 
7.5%
n 11982
 
4.0%
4102
 
1.4%
o 3995
 
1.3%
w 3994
 
1.3%
Other values (12) 16109
 
5.4%

lifeStage
Text

Missing 

Distinct 240
Distinct (%) 0.5%
Missing 539845
Missing (%) 92.4%
Memory size 4.5 MiB

Length

Max length 62
Median length 43
Mean length 7.309247903
Min length 3

Characters and Unicode

Total characters 324209
Distinct characters 63
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 91
Unique (%) 0.2%

Sample

1st row Metamorph
2nd row Larva
3rd row Eggs
4th row Larva
5th row Metamorph
ValueCountFrequency (%)
juvenile 20305
43.8%
larva 6310
 
13.6%
larvae 5451
 
11.8%
adult 3825
 
8.3%
hatchling 2086
 
4.5%
metamorph 1344
 
2.9%
embryo 905
 
2.0%
eggs 812
 
1.8%
neonate 578
 
1.2%
subadult 530
 
1.1%
Other values (117) 4162
 
9.0%

Most occurring characters

ValueCountFrequency (%)
e 51203
15.8%
v 32241
9.9%
a 30579
9.4%
l 27879
8.6%
u 25504
7.9%
n 24522
7.6%
i 23249
 
7.2%
J 20353
 
6.3%
r 16052
 
5.0%
L 11780
 
3.6%
Other values (53) 60847
18.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 277884
85.7%
Uppercase Letter 43941
 
13.6%
Space Separator 1952
 
0.6%
Open Punctuation 122
 
< 0.1%
Close Punctuation 122
 
< 0.1%
Other Punctuation 87
 
< 0.1%
Decimal Number 61
 
< 0.1%
Dash Punctuation 35
 
< 0.1%
Math Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 51203
18.4%
v 32241
11.6%
a 30579
11.0%
l 27879
10.0%
u 25504
9.2%
n 24522
8.8%
i 23249
8.4%
r 16052
 
5.8%
t 10794
 
3.9%
d 5137
 
1.8%
Other values (14) 30724
11.1%
Uppercase Letter
ValueCountFrequency (%)
J 20353
46.3%
L 11780
26.8%
A 3619
 
8.2%
E 2494
 
5.7%
H 2456
 
5.6%
M 1238
 
2.8%
N 692
 
1.6%
S 554
 
1.3%
P 382
 
0.9%
R 95
 
0.2%
Other values (10) 278
 
0.6%
Decimal Number
ValueCountFrequency (%)
2 26
42.6%
5 16
26.2%
1 6
 
9.8%
3 4
 
6.6%
4 4
 
6.6%
6 2
 
3.3%
8 1
 
1.6%
9 1
 
1.6%
0 1
 
1.6%
Other Punctuation
ValueCountFrequency (%)
, 31
35.6%
; 28
32.2%
/ 15
17.2%
? 10
 
11.5%
. 3
 
3.4%
Space Separator
ValueCountFrequency (%)
1952
100.0%
Open Punctuation
ValueCountFrequency (%)
( 122
100.0%
Close Punctuation
ValueCountFrequency (%)
) 122
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 35
100.0%
Math Symbol
ValueCountFrequency (%)
+ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 321825
99.3%
Common 2384
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 51203
15.9%
v 32241
10.0%
a 30579
9.5%
l 27879
8.7%
u 25504
7.9%
n 24522
7.6%
i 23249
 
7.2%
J 20353
 
6.3%
r 16052
 
5.0%
L 11780
 
3.7%
Other values (34) 58463
18.2%
Common
ValueCountFrequency (%)
1952
81.9%
( 122
 
5.1%
) 122
 
5.1%
- 35
 
1.5%
, 31
 
1.3%
; 28
 
1.2%
2 26
 
1.1%
5 16
 
0.7%
/ 15
 
0.6%
? 10
 
0.4%
Other values (9) 27
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 324209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 51203
15.8%
v 32241
9.9%
a 30579
9.4%
l 27879
8.6%
u 25504
7.9%
n 24522
7.6%
i 23249
 
7.2%
J 20353
 
6.3%
r 16052
 
5.0%
L 11780
 
3.6%
Other values (53) 60847
18.8%
Distinct 31
Distinct (%) < 0.1%
Missing 5684
Missing (%) 1.0%
Memory size 4.5 MiB

Length

Max length 53
Median length 7
Mean length 7.117061383
Min length 3

Characters and Unicode

Total characters 4117341
Distinct characters 26
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row Ethanol
2nd row Ethanol; Histological Material
3rd row Ethanol; Dry
4th row Ethanol
5th row Ethanol
ValueCountFrequency (%)
ethanol 553871
93.4%
dry 13058
 
2.2%
formalin 8143
 
1.4%
cleared 4474
 
0.8%
and 4474
 
0.8%
stained 4474
 
0.8%
histological 2058
 
0.3%
material 2058
 
0.3%
photograph 126
 
< 0.1%
sem 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 581736
14.1%
l 572662
13.9%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (16) 92885
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3511631
85.3%
Uppercase Letter 588271
 
14.3%
Space Separator 14223
 
0.3%
Other Punctuation 3216
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 581736
16.6%
l 572662
16.3%
n 570962
16.3%
o 566382
16.1%
t 562587
16.0%
h 554123
15.8%
r 27859
 
0.8%
i 18791
 
0.5%
e 15480
 
0.4%
d 13422
 
0.4%
Other values (6) 27627
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
E 553874
94.2%
D 13058
 
2.2%
F 8143
 
1.4%
S 4477
 
0.8%
C 4474
 
0.8%
M 2061
 
0.4%
H 2058
 
0.3%
P 126
 
< 0.1%
Space Separator
ValueCountFrequency (%)
14223
100.0%
Other Punctuation
ValueCountFrequency (%)
; 3216
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4099902
99.6%
Common 17439
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 581736
14.2%
l 572662
14.0%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (14) 75446
 
1.8%
Common
ValueCountFrequency (%)
14223
81.6%
; 3216
 
18.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4117341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 581736
14.1%
l 572662
13.9%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (16) 92885
 
2.3%

associatedMedia
Text

Missing 

Distinct 4962
Distinct (%) 96.4%
Missing 579054
Missing (%) 99.1%
Memory size 4.5 MiB

Length

Max length 299
Median length 279
Mean length 68.65785895
Min length 48

Characters and Unicode

Total characters 353382
Distinct characters 31
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4905
Unique (%) 95.3%

Sample

1st row https://collections.nmnh.si.edu/media/?i=14894414; 14895830; 14895831; 14895832; 14895833
2nd row https://collections.nmnh.si.edu/media/?i=14589063; 14589068
3rd row https://collections.nmnh.si.edu/media/?i=14894289; 14894859; 14894860
4th row https://collections.nmnh.si.edu/media/?i=6000993; 6000994; 6000992
5th row https://collections.nmnh.si.edu/media/?i=16155167; 16155168; 16155169; 16155170
ValueCountFrequency (%)
https://collections.nmnh.si.edu/media/?i=14580337 28
 
0.2%
10295705 27
 
0.2%
https://collections.nmnh.si.edu/media/?i=10389334 27
 
0.2%
10169077 19
 
0.1%
10153185 18
 
0.1%
https://collections.nmnh.si.edu/media/?i=16688871 13
 
0.1%
6001652 12
 
0.1%
https://collections.nmnh.si.edu/media/?i=10690530 11
 
0.1%
10690531 11
 
0.1%
10295177 10
 
0.1%
Other values (14831) 15298
98.9%

Most occurring characters

ValueCountFrequency (%)
1 22331
 
6.3%
/ 20588
 
5.8%
i 20588
 
5.8%
0 17642
 
5.0%
t 15441
 
4.4%
s 15441
 
4.4%
e 15441
 
4.4%
n 15441
 
4.4%
. 15441
 
4.4%
2 14443
 
4.1%
Other values (21) 180585
51.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 159557
45.2%
Decimal Number 121701
34.4%
Other Punctuation 56650
 
16.0%
Space Separator 10327
 
2.9%
Math Symbol 5147
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 20588
12.9%
t 15441
9.7%
s 15441
9.7%
e 15441
9.7%
n 15441
9.7%
h 10294
 
6.5%
d 10294
 
6.5%
m 10294
 
6.5%
l 10294
 
6.5%
o 10294
 
6.5%
Other values (4) 25735
16.1%
Decimal Number
ValueCountFrequency (%)
1 22331
18.3%
0 17642
14.5%
2 14443
11.9%
6 12036
9.9%
4 11244
9.2%
8 9476
7.8%
9 9312
7.7%
3 9025
7.4%
5 8430
 
6.9%
7 7762
 
6.4%
Other Punctuation
ValueCountFrequency (%)
/ 20588
36.3%
. 15441
27.3%
; 10327
18.2%
? 5147
 
9.1%
: 5147
 
9.1%
Space Separator
ValueCountFrequency (%)
10327
100.0%
Math Symbol
ValueCountFrequency (%)
= 5147
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 193825
54.8%
Latin 159557
45.2%

Most frequent character per script

Common
ValueCountFrequency (%)
1 22331
11.5%
/ 20588
10.6%
0 17642
 
9.1%
. 15441
 
8.0%
2 14443
 
7.5%
6 12036
 
6.2%
4 11244
 
5.8%
10327
 
5.3%
; 10327
 
5.3%
8 9476
 
4.9%
Other values (7) 49970
25.8%
Latin
ValueCountFrequency (%)
i 20588
12.9%
t 15441
9.7%
s 15441
9.7%
e 15441
9.7%
n 15441
9.7%
h 10294
 
6.5%
d 10294
 
6.5%
m 10294
 
6.5%
l 10294
 
6.5%
o 10294
 
6.5%
Other values (4) 25735
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 353382
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 22331
 
6.3%
/ 20588
 
5.8%
i 20588
 
5.8%
0 17642
 
5.0%
t 15441
 
4.4%
s 15441
 
4.4%
e 15441
 
4.4%
n 15441
 
4.4%
. 15441
 
4.4%
2 14443
 
4.1%
Other values (21) 180585
51.1%

associatedSequences
Text

Missing 

Distinct 719
Distinct (%) 99.7%
Missing 583480
Missing (%) 99.9%
Memory size 4.5 MiB

Length

Max length 699
Median length 99
Mean length 112.1983356
Min length 49

Characters and Unicode

Total characters 80895
Distinct characters 55
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 717
Unique (%) 99.4%

Sample

1st row https://www.ncbi.nlm.nih.gov/gquery?term=AF199141|https://www.ncbi.nlm.nih.gov/gquery?term=AF199204
2nd row https://www.ncbi.nlm.nih.gov/gquery?term=OM928184|https://www.ncbi.nlm.nih.gov/gquery?term=OM943246
3rd row https://www.ncbi.nlm.nih.gov/gquery?term=JQ914700
4th row https://www.ncbi.nlm.nih.gov/gquery?term=FJ613461
5th row https://www.ncbi.nlm.nih.gov/gquery?term=FJ766602|https://www.ncbi.nlm.nih.gov/gquery?term=FJ784443
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=jn112709|https://www.ncbi.nlm.nih.gov/gquery?term=jn112771|https://www.ncbi.nlm.nih.gov/gquery?term=jn112642 2
 
0.3%
https://www.ncbi.nlm.nih.gov/gquery?term=ay604497 2
 
0.3%
https://www.ncbi.nlm.nih.gov/gquery?term=fj976636 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jn377389|https://www.ncbi.nlm.nih.gov/gquery?term=jn377393|https://www.ncbi.nlm.nih.gov/gquery?term=jn377405 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc129216|https://www.ncbi.nlm.nih.gov/gquery?term=kc129324 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay604512 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj766829|https://www.ncbi.nlm.nih.gov/gquery?term=fj784465 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=om928184|https://www.ncbi.nlm.nih.gov/gquery?term=om943246 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq914700 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj613461 1
 
0.1%
Other values (709) 709
98.3%

Most occurring characters

ValueCountFrequency (%)
. 6533
 
8.1%
t 4896
 
6.1%
/ 4896
 
6.1%
w 4896
 
6.1%
n 4896
 
6.1%
h 3264
 
4.0%
r 3264
 
4.0%
i 3264
 
4.0%
e 3264
 
4.0%
m 3264
 
4.0%
Other values (45) 38458
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50592
62.5%
Other Punctuation 14693
 
18.2%
Decimal Number 9801
 
12.1%
Uppercase Letter 3266
 
4.0%
Math Symbol 2543
 
3.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
J 653
20.0%
F 619
19.0%
M 450
13.8%
K 434
13.3%
A 177
 
5.4%
Y 150
 
4.6%
Q 129
 
3.9%
H 104
 
3.2%
N 86
 
2.6%
O 76
 
2.3%
Other values (10) 388
11.9%
Lowercase Letter
ValueCountFrequency (%)
t 4896
 
9.7%
w 4896
 
9.7%
n 4896
 
9.7%
h 3264
 
6.5%
r 3264
 
6.5%
i 3264
 
6.5%
e 3264
 
6.5%
m 3264
 
6.5%
g 3264
 
6.5%
q 1632
 
3.2%
Other values (9) 14688
29.0%
Decimal Number
ValueCountFrequency (%)
4 1506
15.4%
6 1267
12.9%
7 1220
12.4%
8 1087
11.1%
3 960
9.8%
1 840
8.6%
5 786
8.0%
2 763
7.8%
9 763
7.8%
0 609
6.2%
Other Punctuation
ValueCountFrequency (%)
. 6533
44.5%
/ 4896
33.3%
? 1632
 
11.1%
: 1632
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 1632
64.2%
| 911
35.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 53858
66.6%
Common 27037
33.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 4896
 
9.1%
w 4896
 
9.1%
n 4896
 
9.1%
h 3264
 
6.1%
r 3264
 
6.1%
i 3264
 
6.1%
e 3264
 
6.1%
m 3264
 
6.1%
g 3264
 
6.1%
q 1632
 
3.0%
Other values (29) 17954
33.3%
Common
ValueCountFrequency (%)
. 6533
24.2%
/ 4896
18.1%
= 1632
 
6.0%
? 1632
 
6.0%
: 1632
 
6.0%
4 1506
 
5.6%
6 1267
 
4.7%
7 1220
 
4.5%
8 1087
 
4.0%
3 960
 
3.6%
Other values (6) 4672
17.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80895
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 6533
 
8.1%
t 4896
 
6.1%
/ 4896
 
6.1%
w 4896
 
6.1%
n 4896
 
6.1%
h 3264
 
4.0%
r 3264
 
4.0%
i 3264
 
4.0%
e 3264
 
4.0%
m 3264
 
4.0%
Other values (45) 38458
47.5%

occurrenceRemarks
Text

Missing 

Distinct 5339
Distinct (%) 20.1%
Missing 557618
Missing (%) 95.4%
Memory size 4.5 MiB

Length

Max length 1294
Median length 381
Mean length 66.70947598
Min length 3

Characters and Unicode

Total characters 1773338
Distinct characters 91
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3351
Unique (%) 12.6%

Sample

1st row Collected from vegetation removal plot (Cocolob 2) in coastal strand Cocolobo uvifera forest, ca. 10 m inland from beach.
2nd row Collected in roadside ditch in gum/bay swamp. Water depth: 10-40 cm.
3rd row Complete clutch of eggs removed from the ovaries of a female (Total Length: 57 inches) collected along wooded road.
4th row Collected on surface at night.
5th row Collected above and below the falls, south of the creek.
ValueCountFrequency (%)
collected 21028
 
7.1%
in 15429
 
5.2%
of 11658
 
3.9%
the 11088
 
3.7%
on 10611
 
3.6%
from 7596
 
2.6%
and 5597
 
1.9%
at 5284
 
1.8%
area 4127
 
1.4%
road 4049
 
1.4%
Other values (6088) 200792
67.5%

Most occurring characters

ValueCountFrequency (%)
270676
15.3%
e 160711
 
9.1%
o 140496
 
7.9%
a 114427
 
6.5%
t 108900
 
6.1%
l 98347
 
5.5%
n 89158
 
5.0%
r 81415
 
4.6%
d 76949
 
4.3%
i 72320
 
4.1%
Other values (81) 559939
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1313680
74.1%
Space Separator 270676
 
15.3%
Uppercase Letter 64926
 
3.7%
Decimal Number 55444
 
3.1%
Other Punctuation 51683
 
2.9%
Open Punctuation 5630
 
0.3%
Close Punctuation 5620
 
0.3%
Dash Punctuation 5566
 
0.3%
Math Symbol 113
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 160711
12.2%
o 140496
10.7%
a 114427
 
8.7%
t 108900
 
8.3%
l 98347
 
7.5%
n 89158
 
6.8%
r 81415
 
6.2%
d 76949
 
5.9%
i 72320
 
5.5%
s 64649
 
4.9%
Other values (23) 306308
23.3%
Uppercase Letter
ValueCountFrequency (%)
C 24437
37.6%
P 4884
 
7.5%
N 3946
 
6.1%
A 3756
 
5.8%
S 3750
 
5.8%
T 2957
 
4.6%
R 2957
 
4.6%
M 2263
 
3.5%
F 1854
 
2.9%
H 1826
 
2.8%
Other values (16) 12296
18.9%
Other Punctuation
ValueCountFrequency (%)
. 35953
69.6%
, 7909
 
15.3%
: 3238
 
6.3%
" 1687
 
3.3%
; 1332
 
2.6%
' 566
 
1.1%
/ 486
 
0.9%
% 225
 
0.4%
# 199
 
0.4%
? 57
 
0.1%
Other values (2) 31
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 12010
21.7%
0 9775
17.6%
2 7516
13.6%
9 5201
9.4%
8 4032
 
7.3%
5 3818
 
6.9%
3 3764
 
6.8%
7 3390
 
6.1%
6 3356
 
6.1%
4 2582
 
4.7%
Math Symbol
ValueCountFrequency (%)
= 100
88.5%
+ 7
 
6.2%
< 4
 
3.5%
> 2
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 5543
98.5%
[ 87
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 5533
98.5%
] 87
 
1.5%
Space Separator
ValueCountFrequency (%)
270676
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5566
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1378606
77.7%
Common 394732
 
22.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 160711
11.7%
o 140496
 
10.2%
a 114427
 
8.3%
t 108900
 
7.9%
l 98347
 
7.1%
n 89158
 
6.5%
r 81415
 
5.9%
d 76949
 
5.6%
i 72320
 
5.2%
s 64649
 
4.7%
Other values (49) 371234
26.9%
Common
ValueCountFrequency (%)
270676
68.6%
. 35953
 
9.1%
1 12010
 
3.0%
0 9775
 
2.5%
, 7909
 
2.0%
2 7516
 
1.9%
- 5566
 
1.4%
( 5543
 
1.4%
) 5533
 
1.4%
9 5201
 
1.3%
Other values (22) 29050
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1773305
> 99.9%
None 33
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
270676
15.3%
e 160711
 
9.1%
o 140496
 
7.9%
a 114427
 
6.5%
t 108900
 
6.1%
l 98347
 
5.5%
n 89158
 
5.0%
r 81415
 
4.6%
d 76949
 
4.3%
i 72320
 
4.1%
Other values (74) 559906
31.6%
None
ValueCountFrequency (%)
ö 14
42.4%
á 7
21.2%
é 5
 
15.2%
ó 2
 
6.1%
ü 2
 
6.1%
è 2
 
6.1%
ñ 1
 
3.0%

fieldNumber
Text

Missing 

Distinct 2
Distinct (%) 25.0%
Missing 584193
Missing (%) > 99.9%
Memory size 4.5 MiB

Length

Max length 7
Median length 6
Mean length 6.125
Min length 6

Characters and Unicode

Total characters 49
Distinct characters 8
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 12.5%

Sample

1st row 83-012
2nd row 83-012
3rd row 83-012
4th row 83-012
5th row 83-012
ValueCountFrequency (%)
83-012 7
87.5%
83-024a 1
 
12.5%

Most occurring characters

ValueCountFrequency (%)
8 8
16.3%
3 8
16.3%
- 8
16.3%
0 8
16.3%
2 8
16.3%
1 7
14.3%
4 1
 
2.0%
A 1
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40
81.6%
Dash Punctuation 8
 
16.3%
Uppercase Letter 1
 
2.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 8
20.0%
3 8
20.0%
0 8
20.0%
2 8
20.0%
1 7
17.5%
4 1
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 48
98.0%
Latin 1
 
2.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 8
16.7%
3 8
16.7%
- 8
16.7%
0 8
16.7%
2 8
16.7%
1 7
14.6%
4 1
 
2.1%
Latin
ValueCountFrequency (%)
A 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 8
16.3%
3 8
16.3%
- 8
16.3%
0 8
16.3%
2 8
16.3%
1 7
14.3%
4 1
 
2.0%
A 1
 
2.0%

eventDate
Text

Missing 

Distinct 31354
Distinct (%) 5.7%
Missing 37781
Missing (%) 6.5%
Memory size 4.5 MiB

Length

Max length 24
Median length 10
Mean length 9.988499689
Min length 4

Characters and Unicode

Total characters 5457916
Distinct characters 16
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7237
Unique (%) 1.3%

Sample

1st row 1972-02-01/1972-02-03
2nd row 1971-09-03
3rd row 1992-10-15
4th row 1992-06-24
5th row 1998-09-03
ValueCountFrequency (%)
1973-09-22 723
 
0.1%
1883 697
 
0.1%
1998-10-09 690
 
0.1%
1935 684
 
0.1%
1971-08-16 610
 
0.1%
1966-04-11 579
 
0.1%
1970-06-19 564
 
0.1%
1976-10-03 540
 
0.1%
1971-07-31 521
 
0.1%
1969-06-27 472
 
0.1%
Other values (31319) 540908
98.9%

Most occurring characters

ValueCountFrequency (%)
- 1060453
19.4%
1 993873
18.2%
0 815551
14.9%
9 736835
13.5%
2 357571
 
6.6%
7 295686
 
5.4%
6 289115
 
5.3%
8 288256
 
5.3%
3 212557
 
3.9%
5 209699
 
3.8%
Other values (6) 198320
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4378606
80.2%
Dash Punctuation 1060453
 
19.4%
Other Punctuation 17721
 
0.3%
Space Separator 568
 
< 0.1%
Lowercase Letter 568
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 993873
22.7%
0 815551
18.6%
9 736835
16.8%
2 357571
 
8.2%
7 295686
 
6.8%
6 289115
 
6.6%
8 288256
 
6.6%
3 212557
 
4.9%
5 209699
 
4.8%
4 179463
 
4.1%
Other Punctuation
ValueCountFrequency (%)
/ 17679
99.8%
, 42
 
0.2%
Lowercase Letter
ValueCountFrequency (%)
o 284
50.0%
r 284
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1060453
100.0%
Space Separator
ValueCountFrequency (%)
568
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5457348
> 99.9%
Latin 568
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1060453
19.4%
1 993873
18.2%
0 815551
14.9%
9 736835
13.5%
2 357571
 
6.6%
7 295686
 
5.4%
6 289115
 
5.3%
8 288256
 
5.3%
3 212557
 
3.9%
5 209699
 
3.8%
Other values (4) 197752
 
3.6%
Latin
ValueCountFrequency (%)
o 284
50.0%
r 284
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5457916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1060453
19.4%
1 993873
18.2%
0 815551
14.9%
9 736835
13.5%
2 357571
 
6.6%
7 295686
 
5.4%
6 289115
 
5.3%
8 288256
 
5.3%
3 212557
 
3.9%
5 209699
 
3.8%
Other values (6) 198320
 
3.6%

startDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 55728
Missing (%) 9.5%
Memory size 4.5 MiB

Length

Max length 3
Median length 3
Mean length 2.785824441
Min length 1

Characters and Unicode

Total characters 1472233
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 32
2nd row 246
3rd row 289
4th row 176
5th row 246
ValueCountFrequency (%)
151 5188
 
1.0%
212 5134
 
1.0%
243 4767
 
0.9%
181 4630
 
0.9%
120 3680
 
0.7%
91 3127
 
0.6%
90 2993
 
0.6%
227 2917
 
0.6%
152 2853
 
0.5%
230 2852
 
0.5%
Other values (356) 490332
92.8%

Most occurring characters

ValueCountFrequency (%)
1 312432
21.2%
2 283563
19.3%
3 159914
10.9%
4 107096
 
7.3%
0 106264
 
7.2%
5 103396
 
7.0%
8 101181
 
6.9%
9 100780
 
6.8%
6 99795
 
6.8%
7 97812
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1472233
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 312432
21.2%
2 283563
19.3%
3 159914
10.9%
4 107096
 
7.3%
0 106264
 
7.2%
5 103396
 
7.0%
8 101181
 
6.9%
9 100780
 
6.8%
6 99795
 
6.8%
7 97812
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1472233
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 312432
21.2%
2 283563
19.3%
3 159914
10.9%
4 107096
 
7.3%
0 106264
 
7.2%
5 103396
 
7.0%
8 101181
 
6.9%
9 100780
 
6.8%
6 99795
 
6.8%
7 97812
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1472233
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 312432
21.2%
2 283563
19.3%
3 159914
10.9%
4 107096
 
7.3%
0 106264
 
7.2%
5 103396
 
7.0%
8 101181
 
6.9%
9 100780
 
6.8%
6 99795
 
6.8%
7 97812
 
6.6%

endDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 55637
Missing (%) 9.5%
Memory size 4.5 MiB

Length

Max length 3
Median length 3
Mean length 2.786563217
Min length 1

Characters and Unicode

Total characters 1472877
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 34
2nd row 246
3rd row 289
4th row 176
5th row 246
ValueCountFrequency (%)
151 5257
 
1.0%
212 5068
 
1.0%
243 4867
 
0.9%
181 4612
 
0.9%
120 3515
 
0.7%
91 3266
 
0.6%
230 3042
 
0.6%
227 2924
 
0.6%
59 2923
 
0.6%
90 2917
 
0.6%
Other values (356) 490173
92.7%

Most occurring characters

ValueCountFrequency (%)
1 313155
21.3%
2 283562
19.3%
3 160654
10.9%
4 106856
 
7.3%
0 106039
 
7.2%
5 103988
 
7.1%
8 101330
 
6.9%
9 101260
 
6.9%
6 98792
 
6.7%
7 97241
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1472877
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 313155
21.3%
2 283562
19.3%
3 160654
10.9%
4 106856
 
7.3%
0 106039
 
7.2%
5 103988
 
7.1%
8 101330
 
6.9%
9 101260
 
6.9%
6 98792
 
6.7%
7 97241
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1472877
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 313155
21.3%
2 283562
19.3%
3 160654
10.9%
4 106856
 
7.3%
0 106039
 
7.2%
5 103988
 
7.1%
8 101330
 
6.9%
9 101260
 
6.9%
6 98792
 
6.7%
7 97241
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1472877
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 313155
21.3%
2 283562
19.3%
3 160654
10.9%
4 106856
 
7.3%
0 106039
 
7.2%
5 103988
 
7.1%
8 101330
 
6.9%
9 101260
 
6.9%
6 98792
 
6.7%
7 97241
 
6.6%
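
The sample rows for eventDate, startDayOfYear and endDayOfYear above appear consistent with one another: the interval 1972-02-01/1972-02-03 lines up with days 32 and 34. The sketch below shows one way that relationship could be reproduced. It assumes eventDate holds either a single full ISO 8601 date or a `start/end` pair of full dates; the helper name `day_of_year_range` is hypothetical, and partial values such as a bare year are simply returned as `None`.

```python
from datetime import date


def _parse_full_date(text):
    """Parse 'YYYY-MM-DD'; return None for partial or malformed values."""
    if len(text) != 10:
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None


def day_of_year_range(event_date):
    """Return (start_day_of_year, end_day_of_year) or None if not derivable."""
    parts = event_date.strip().split("/")
    if len(parts) > 2:
        return None
    parsed = [_parse_full_date(p) for p in parts]
    if any(d is None for d in parsed):
        return None
    # A single date is treated as an interval that starts and ends on the same day.
    return parsed[0].timetuple().tm_yday, parsed[-1].timetuple().tm_yday


print(day_of_year_range("1972-02-01/1972-02-03"))
print(day_of_year_range("1971-09-03"))
print(day_of_year_range("1883"))
```

Comparing such derived values against the published startDayOfYear and endDayOfYear columns is one possible consistency check; how the source dataset actually populated those columns is not stated in this report.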

year
Text

Missing 

Distinct 184
Distinct (%) < 0.1%
Missing 37781
Missing (%) 6.5%
Memory size 4.5 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 2185680
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row 1972
2nd row 1971
3rd row 1992
4th row 1992
5th row 1998
ValueCountFrequency (%)
1971 17001
 
3.1%
1966 15984
 
2.9%
1969 15783
 
2.9%
1970 15631
 
2.9%
1976 15293
 
2.8%
1980 15182
 
2.8%
1979 14987
 
2.7%
1972 14413
 
2.6%
1961 12799
 
2.3%
1984 12649
 
2.3%
Other values (174) 396698
72.6%

Most occurring characters

ValueCountFrequency (%)
9 629501
28.8%
1 601542
27.5%
7 176597
 
8.1%
6 174519
 
8.0%
8 162711
 
7.4%
0 112997
 
5.2%
2 90398
 
4.1%
5 85543
 
3.9%
3 82534
 
3.8%
4 69338
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2185680
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 629501
28.8%
1 601542
27.5%
7 176597
 
8.1%
6 174519
 
8.0%
8 162711
 
7.4%
0 112997
 
5.2%
2 90398
 
4.1%
5 85543
 
3.9%
3 82534
 
3.8%
4 69338
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 2185680
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 629501
28.8%
1 601542
27.5%
7 176597
 
8.1%
6 174519
 
8.0%
8 162711
 
7.4%
0 112997
 
5.2%
2 90398
 
4.1%
5 85543
 
3.9%
3 82534
 
3.8%
4 69338
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2185680
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 629501
28.8%
1 601542
27.5%
7 176597
 
8.1%
6 174519
 
8.0%
8 162711
 
7.4%
0 112997
 
5.2%
2 90398
 
4.1%
5 85543
 
3.9%
3 82534
 
3.8%
4 69338
 
3.2%

month
Text

Missing 

Distinct 12
Distinct (%) < 0.1%
Missing 54300
Missing (%) 9.3%
Memory size 4.5 MiB

Length

Max length 2
Median length 1
Mean length 1.163641888
Min length 1

Characters and Unicode

Total characters 616615
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 2
2nd row 9
3rd row 10
4th row 6
5th row 9
ValueCountFrequency (%)
8 67636
12.8%
7 64391
12.2%
5 64348
12.1%
6 59630
11.3%
4 55731
10.5%
3 47041
8.9%
10 43052
8.1%
9 36733
6.9%
11 25771
 
4.9%
2 25643
 
4.8%
Other values (2) 39925
7.5%

Most occurring characters

ValueCountFrequency (%)
1 134519
21.8%
8 67636
11.0%
7 64391
10.4%
5 64348
10.4%
6 59630
9.7%
4 55731
9.0%
3 47041
 
7.6%
2 43534
 
7.1%
0 43052
 
7.0%
9 36733
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 616615
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 134519
21.8%
8 67636
11.0%
7 64391
10.4%
5 64348
10.4%
6 59630
9.7%
4 55731
9.0%
3 47041
 
7.6%
2 43534
 
7.1%
0 43052
 
7.0%
9 36733
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 616615
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 134519
21.8%
8 67636
11.0%
7 64391
10.4%
5 64348
10.4%
6 59630
9.7%
4 55731
9.0%
3 47041
 
7.6%
2 43534
 
7.1%
0 43052
 
7.0%
9 36733
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 616615
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 134519
21.8%
8 67636
11.0%
7 64391
10.4%
5 64348
10.4%
6 59630
9.7%
4 55731
9.0%
3 47041
 
7.6%
2 43534
 
7.1%
0 43052
 
7.0%
9 36733
 
6.0%

day
Text

Missing 

Distinct 31
Distinct (%) < 0.1%
Missing 85891
Missing (%) 14.7%
Memory size 4.5 MiB

Length

Max length 2
Median length 2
Mean length 1.714494993
Min length 1

Characters and Unicode

Total characters 854350
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 1
2nd row 3
3rd row 15
4th row 24
5th row 3
ValueCountFrequency (%)
15 19564
 
3.9%
13 17837
 
3.6%
19 17383
 
3.5%
21 17361
 
3.5%
25 17217
 
3.5%
24 17005
 
3.4%
3 17001
 
3.4%
16 16842
 
3.4%
20 16734
 
3.4%
28 16674
 
3.3%
Other values (21) 324692
65.2%

Most occurring characters

ValueCountFrequency (%)
1 227367
26.6%
2 211630
24.8%
3 75423
 
8.8%
5 52999
 
6.2%
8 48666
 
5.7%
9 48337
 
5.7%
0 48091
 
5.6%
6 47877
 
5.6%
4 47391
 
5.5%
7 46569
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 854350
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 227367
26.6%
2 211630
24.8%
3 75423
 
8.8%
5 52999
 
6.2%
8 48666
 
5.7%
9 48337
 
5.7%
0 48091
 
5.6%
6 47877
 
5.6%
4 47391
 
5.5%
7 46569
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Common 854350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 227367
26.6%
2 211630
24.8%
3 75423
 
8.8%
5 52999
 
6.2%
8 48666
 
5.7%
9 48337
 
5.7%
0 48091
 
5.6%
6 47877
 
5.6%
4 47391
 
5.5%
7 46569
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 854350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 227367
26.6%
2 211630
24.8%
3 75423
 
8.8%
5 52999
 
6.2%
8 48666
 
5.7%
9 48337
 
5.7%
0 48091
 
5.6%
6 47877
 
5.6%
4 47391
 
5.5%
7 46569
 
5.5%
Distinct42558
Distinct (%)7.3%
Missing51
Missing (%)< 0.1%
Memory size4.5 MiB

Length

Max length194
Median length11
Mean length12.14387743
Min length4

Characters and Unicode

Total characters7093846
Distinct characters74
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14192
Unique (%)2.4%

Sample

1st row01-03 February 1972
2nd row3 Sep 1971
3rd row-- --- ----
4th row15 Oct 1992; 09:05-13:00 hrs
5th row24 Jun 1992; 10:30-11:40 hrs
ValueCountFrequency (%)
173374
 
9.4%
may 65316
 
3.5%
aug 63760
 
3.5%
jul 58386
 
3.2%
jun 53770
 
2.9%
apr 50984
 
2.8%
mar 43098
 
2.3%
oct 40349
 
2.2%
sep 34295
 
1.9%
hrs 24306
 
1.3%
Other values (3264) 1238022
67.1%

Most occurring characters

ValueCountFrequency (%)
1261510
17.8%
1 874532
 
12.3%
9 688315
 
9.7%
- 499756
 
7.0%
2 328876
 
4.6%
0 243409
 
3.4%
6 227222
 
3.2%
7 227024
 
3.2%
8 217953
 
3.1%
u 208644
 
2.9%
Other values (64) 2316605
32.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3263423
46.0%
Lowercase Letter 1431190
20.2%
Space Separator 1261510
 
17.8%
Uppercase Letter 543907
 
7.7%
Dash Punctuation 499756
 
7.0%
Other Punctuation 92897
 
1.3%
Open Punctuation 581
 
< 0.1%
Close Punctuation 581
 
< 0.1%
Format 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 208644
14.6%
r 158708
11.1%
a 157043
11.0%
e 120723
8.4%
n 97727
 
6.8%
p 94749
 
6.6%
y 81426
 
5.7%
l 78861
 
5.5%
g 78230
 
5.5%
c 71344
 
5.0%
Other values (16) 283735
19.8%
Uppercase Letter
ValueCountFrequency (%)
J 147803
27.2%
A 124847
23.0%
M 113882
20.9%
O 43248
 
8.0%
S 39048
 
7.2%
F 26562
 
4.9%
N 26070
 
4.8%
D 18361
 
3.4%
C 3404
 
0.6%
E 144
 
< 0.1%
Other values (13) 538
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 874532
26.8%
9 688315
21.1%
2 328876
 
10.1%
0 243409
 
7.5%
6 227222
 
7.0%
7 227024
 
7.0%
8 217953
 
6.7%
3 176664
 
5.4%
5 152254
 
4.7%
4 127174
 
3.9%
Other Punctuation
ValueCountFrequency (%)
: 41924
45.1%
; 34825
37.5%
. 14987
 
16.1%
, 770
 
0.8%
/ 307
 
0.3%
' 46
 
< 0.1%
" 20
 
< 0.1%
? 18
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 580
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 580
99.8%
] 1
 
0.2%
Space Separator
ValueCountFrequency (%)
1261510
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 499756
100.0%
Format
ValueCountFrequency (%)
­ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5118749
72.2%
Latin 1975097
 
27.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 208644
 
10.6%
r 158708
 
8.0%
a 157043
 
8.0%
J 147803
 
7.5%
A 124847
 
6.3%
e 120723
 
6.1%
M 113882
 
5.8%
n 97727
 
4.9%
p 94749
 
4.8%
y 81426
 
4.1%
Other values (39) 669545
33.9%
Common
ValueCountFrequency (%)
1261510
24.6%
1 874532
17.1%
9 688315
13.4%
- 499756
 
9.8%
2 328876
 
6.4%
0 243409
 
4.8%
6 227222
 
4.4%
7 227024
 
4.4%
8 217953
 
4.3%
3 176664
 
3.5%
Other values (15) 373488
 
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7093845
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1261510
17.8%
1 874532
 
12.3%
9 688315
 
9.7%
- 499756
 
7.0%
2 328876
 
4.6%
0 243409
 
3.4%
6 227222
 
3.2%
7 227024
 
3.2%
8 217953
 
3.1%
u 208644
 
2.9%
Other values (63) 2316604
32.7%
None
ValueCountFrequency (%)
­ 1
100.0%
Distinct6286
Distinct (%)1.1%
Missing4414
Missing (%)0.8%
Memory size4.5 MiB

Length

Max length167
Median length118
Mean length48.81643259
Min length4

Characters and Unicode

Total characters28303133
Distinct characters83
Distinct categories10
Distinct scripts2
Distinct blocks5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1092
Unique (%)0.2%

Sample

1st rowOceania, Papua New Guinea, Central Province, Kairuku-Hiri District, New Guinea
2nd rowNorth America, United States, North Carolina, Buncombe - Yancey
3rd rowOceania, Pacific Ocean , Tonga, Tonga Islands, Tongatapu Island Group, Tonga Islands
4th rowNorth America, Grenada, St. George Parish, Lesser Antilles, Windward Islands, Grenada Island
5th rowNorth America, United States, Virginia, Augusta
ValueCountFrequency (%)
america 483266
 
12.9%
north 476209
 
12.7%
states 351020
 
9.4%
united 349359
 
9.4%
virginia 96173
 
2.6%
south 71896
 
1.9%
islands 71471
 
1.9%
carolina 61728
 
1.7%
54664
 
1.5%
asia 39306
 
1.1%
Other values (4622) 1680221
45.0%

Most occurring characters

ValueCountFrequency (%)
3155526
 
11.1%
a 2668328
 
9.4%
i 2176293
 
7.7%
e 2119779
 
7.5%
t 1973350
 
7.0%
r 1844062
 
6.5%
, 1669519
 
5.9%
n 1511861
 
5.3%
o 1298349
 
4.6%
s 1011828
 
3.6%
Other values (73) 8874238
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19740985
69.7%
Uppercase Letter 3656252
 
12.9%
Space Separator 3155526
 
11.1%
Other Punctuation 1685470
 
6.0%
Dash Punctuation 42316
 
0.1%
Open Punctuation 11057
 
< 0.1%
Close Punctuation 11052
 
< 0.1%
Math Symbol 409
 
< 0.1%
Decimal Number 64
 
< 0.1%
Modifier Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2668328
13.5%
i 2176293
11.0%
e 2119779
10.7%
t 1973350
10.0%
r 1844062
9.3%
n 1511861
7.7%
o 1298349
 
6.6%
s 1011828
 
5.1%
c 896985
 
4.5%
h 740667
 
3.8%
Other values (28) 3499483
17.7%
Uppercase Letter
ValueCountFrequency (%)
A 644681
17.6%
N 528051
14.4%
S 527098
14.4%
U 359922
9.8%
P 226035
 
6.2%
C 185016
 
5.1%
M 170358
 
4.7%
I 135691
 
3.7%
V 116151
 
3.2%
G 115778
 
3.2%
Other values (18) 647471
17.7%
Other Punctuation
ValueCountFrequency (%)
, 1669519
99.1%
. 13671
 
0.8%
' 2228
 
0.1%
? 41
 
< 0.1%
/ 11
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 42077
99.4%
239
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 10381
93.9%
[ 676
 
6.1%
Close Punctuation
ValueCountFrequency (%)
) 10376
93.9%
] 676
 
6.1%
Math Symbol
ValueCountFrequency (%)
= 389
95.1%
+ 20
 
4.9%
Decimal Number
ValueCountFrequency (%)
1 32
50.0%
0 32
50.0%
Space Separator
ValueCountFrequency (%)
3155526
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23397237
82.7%
Common 4905896
 
17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2668328
 
11.4%
i 2176293
 
9.3%
e 2119779
 
9.1%
t 1973350
 
8.4%
r 1844062
 
7.9%
n 1511861
 
6.5%
o 1298349
 
5.5%
s 1011828
 
4.3%
c 896985
 
3.8%
h 740667
 
3.2%
Other values (56) 7155735
30.6%
Common
ValueCountFrequency (%)
3155526
64.3%
, 1669519
34.0%
- 42077
 
0.9%
. 13671
 
0.3%
( 10381
 
0.2%
) 10376
 
0.2%
' 2228
 
< 0.1%
[ 676
 
< 0.1%
] 676
 
< 0.1%
= 389
 
< 0.1%
Other values (7) 377
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28276056
99.9%
None 26786
 
0.1%
Punctuation 239
 
< 0.1%
Latin Ext Additional 50
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3155526
 
11.2%
a 2668328
 
9.4%
i 2176293
 
7.7%
e 2119779
 
7.5%
t 1973350
 
7.0%
r 1844062
 
6.5%
, 1669519
 
5.9%
n 1511861
 
5.3%
o 1298349
 
4.6%
s 1011828
 
3.6%
Other values (57) 8847161
31.3%
None
ValueCountFrequency (%)
é 6953
26.0%
á 5925
22.1%
ã 4537
16.9%
í 4305
16.1%
ó 3223
12.0%
ô 1182
 
4.4%
ñ 439
 
1.6%
â 51
 
0.2%
Đ 50
 
0.2%
ı 48
 
0.2%
Other values (3) 73
 
0.3%
Punctuation
ValueCountFrequency (%)
239
100.0%
Latin Ext Additional
ValueCountFrequency (%)
50
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%
Distinct19
Distinct (%)< 0.1%
Missing4673
Missing (%)0.8%
Memory size4.5 MiB

Length

Max length29
Median length13
Mean length12.48592475
Min length4

Characters and Unicode

Total characters7235943
Distinct characters25
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowOceania
2nd rowNorth America
3rd rowOceania, Pacific Ocean
4th rowNorth America
5th rowNorth America
ValueCountFrequency (%)
america 483251
43.2%
north 418511
37.4%
south 64740
 
5.8%
asia 39303
 
3.5%
oceania 32002
 
2.9%
ocean 28207
 
2.5%
pacific 26665
 
2.4%
africa 20689
 
1.8%
europe 2403
 
0.2%
australia 1401
 
0.1%
Other values (2) 1542
 
0.1%

Most occurring characters

ValueCountFrequency (%)
r 926255
12.8%
a 666463
9.2%
i 631518
8.7%
c 617823
8.5%
e 545863
7.5%
A 544988
7.5%
539186
7.5%
o 485654
 
6.7%
t 485340
 
6.7%
h 483251
 
6.7%
Other values (15) 1309602
18.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5550315
76.7%
Uppercase Letter 1118714
 
15.5%
Space Separator 539186
 
7.5%
Other Punctuation 27728
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 926255
16.7%
a 666463
12.0%
i 631518
11.4%
c 617823
11.1%
e 545863
9.8%
o 485654
8.8%
t 485340
8.7%
h 483251
8.7%
m 483251
8.7%
u 68544
 
1.2%
Other values (6) 156353
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
A 544988
48.7%
N 418511
37.4%
S 64740
 
5.8%
O 60209
 
5.4%
P 26665
 
2.4%
E 2403
 
0.2%
I 1198
 
0.1%
Space Separator
ValueCountFrequency (%)
539186
100.0%
Other Punctuation
ValueCountFrequency (%)
, 27728
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6669029
92.2%
Common 566914
 
7.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 926255
13.9%
a 666463
10.0%
i 631518
9.5%
c 617823
9.3%
e 545863
8.2%
A 544988
8.2%
o 485654
7.3%
t 485340
7.3%
h 483251
7.2%
m 483251
7.2%
Other values (13) 798623
12.0%
Common
ValueCountFrequency (%)
539186
95.1%
, 27728
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7235943
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 926255
12.8%
a 666463
9.2%
i 631518
8.7%
c 617823
8.5%
e 545863
7.5%
A 544988
7.5%
539186
7.5%
o 485654
 
6.7%
t 485340
 
6.7%
h 483251
 
6.7%
Other values (15) 1309602
18.1%

waterBody
Text

Missing 

Distinct3
Distinct (%)< 0.1%
Missing555994
Missing (%)95.2%
Memory size4.5 MiB

Length

Max length14
Median length13
Mean length12.96972383
Min length12

Characters and Unicode

Total characters365837
Distinct characters14
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPacific Ocean
2nd rowPacific Ocean
3rd rowPacific Ocean
4th rowPacific Ocean
5th rowIndian Ocean
ValueCountFrequency (%)
ocean 28207
50.0%
pacific 26665
47.3%
indian 1198
 
2.1%
atlantic 344
 
0.6%

Most occurring characters

ValueCountFrequency (%)
c 81881
22.4%
a 56414
15.4%
i 54872
15.0%
n 30947
 
8.5%
28207
 
7.7%
O 28207
 
7.7%
e 28207
 
7.7%
P 26665
 
7.3%
f 26665
 
7.3%
I 1198
 
0.3%
Other values (4) 2574
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 281216
76.9%
Uppercase Letter 56414
 
15.4%
Space Separator 28207
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 81881
29.1%
a 56414
20.1%
i 54872
19.5%
n 30947
 
11.0%
e 28207
 
10.0%
f 26665
 
9.5%
d 1198
 
0.4%
t 688
 
0.2%
l 344
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
O 28207
50.0%
P 26665
47.3%
I 1198
 
2.1%
A 344
 
0.6%
Space Separator
ValueCountFrequency (%)
28207
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 337630
92.3%
Common 28207
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 81881
24.3%
a 56414
16.7%
i 54872
16.3%
n 30947
 
9.2%
O 28207
 
8.4%
e 28207
 
8.4%
P 26665
 
7.9%
f 26665
 
7.9%
I 1198
 
0.4%
d 1198
 
0.4%
Other values (3) 1376
 
0.4%
Common
ValueCountFrequency (%)
28207
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 81881
22.4%
a 56414
15.4%
i 54872
15.0%
n 30947
 
8.5%
28207
 
7.7%
O 28207
 
7.7%
e 28207
 
7.7%
P 26665
 
7.3%
f 26665
 
7.3%
I 1198
 
0.3%
Other values (4) 2574
 
0.7%

islandGroup
Text

Missing 

Distinct41
Distinct (%)0.2%
Missing564324
Missing (%)96.6%
Memory size4.5 MiB

Length

Max length31
Median length25
Mean length13.3327967
Min length10

Characters and Unicode

Total characters265016
Distinct characters45
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)< 0.1%

Sample

1st rowWindward Islands
2nd rowVirgin Islands
3rd rowHispaniola
4th rowHispaniola
5th rowGreater Sunda Islands
ValueCountFrequency (%)
islands 10225
31.0%
hispaniola 8927
27.1%
virgin 2527
 
7.7%
windward 2377
 
7.2%
bahama 1504
 
4.6%
leeward 1357
 
4.1%
sunda 1019
 
3.1%
greater 1018
 
3.1%
northern 671
 
2.0%
solomon 655
 
2.0%
Other values (48) 2663
 
8.1%

Most occurring characters

ValueCountFrequency (%)
a 41073
15.5%
s 30081
11.4%
n 27407
10.3%
i 26949
10.2%
l 20663
 
7.8%
d 17902
 
6.8%
13066
 
4.9%
o 12195
 
4.6%
r 10747
 
4.1%
I 10283
 
3.9%
Other values (35) 54650
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 218943
82.6%
Uppercase Letter 32978
 
12.4%
Space Separator 13066
 
4.9%
Open Punctuation 8
 
< 0.1%
Math Symbol 8
 
< 0.1%
Close Punctuation 8
 
< 0.1%
Other Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 41073
18.8%
s 30081
13.7%
n 27407
12.5%
i 26949
12.3%
l 20663
9.4%
d 17902
8.2%
o 12195
 
5.6%
r 10747
 
4.9%
p 9142
 
4.2%
e 5828
 
2.7%
Other values (13) 16956
7.7%
Uppercase Letter
ValueCountFrequency (%)
I 10283
31.2%
H 8927
27.1%
V 2533
 
7.7%
W 2377
 
7.2%
S 1736
 
5.3%
B 1647
 
5.0%
L 1372
 
4.2%
G 1061
 
3.2%
C 934
 
2.8%
N 775
 
2.4%
Other values (7) 1333
 
4.0%
Space Separator
ValueCountFrequency (%)
13066
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Math Symbol
ValueCountFrequency (%)
= 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Other Punctuation
ValueCountFrequency (%)
. 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 251921
95.1%
Common 13095
 
4.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 41073
16.3%
s 30081
11.9%
n 27407
10.9%
i 26949
10.7%
l 20663
8.2%
d 17902
 
7.1%
o 12195
 
4.8%
r 10747
 
4.3%
I 10283
 
4.1%
p 9142
 
3.6%
Other values (30) 45479
18.1%
Common
ValueCountFrequency (%)
13066
99.8%
( 8
 
0.1%
= 8
 
0.1%
) 8
 
0.1%
. 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 265016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 41073
15.5%
s 30081
11.4%
n 27407
10.3%
i 26949
10.2%
l 20663
 
7.8%
d 17902
 
6.8%
13066
 
4.9%
o 12195
 
4.6%
r 10747
 
4.1%
I 10283
 
3.9%
Other values (35) 54650
20.6%

island
Text

Missing 

Distinct39
Distinct (%)0.5%
Missing576136
Missing (%)98.6%
Memory size4.5 MiB

Length

Max length20
Median length10
Mean length10.77445753
Min length6

Characters and Unicode

Total characters86896
Distinct characters44
Distinct categories3
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)0.1%

Sample

1st rowNew Guinea
2nd rowGrenada Island
3rd rowNew Guinea
4th rowNew Guinea
5th rowLittle Swan Island
ValueCountFrequency (%)
new 4350
29.0%
guinea 4350
29.0%
island 1306
 
8.7%
borneo 712
 
4.7%
bougainville 652
 
4.3%
sumatra 558
 
3.7%
okinawa 493
 
3.3%
grenada 267
 
1.8%
isla 258
 
1.7%
swan 241
 
1.6%
Other values (44) 1803
12.0%

Most occurring characters

ValueCountFrequency (%)
e 11374
13.1%
a 10388
12.0%
n 8928
10.3%
6925
 
8.0%
i 6731
 
7.7%
u 5716
 
6.6%
w 5086
 
5.9%
G 4959
 
5.7%
N 4459
 
5.1%
l 3060
 
3.5%
Other values (34) 19270
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 65206
75.0%
Uppercase Letter 14765
 
17.0%
Space Separator 6925
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11374
17.4%
a 10388
15.9%
n 8928
13.7%
i 6731
10.3%
u 5716
8.8%
w 5086
7.8%
l 3060
 
4.7%
o 2768
 
4.2%
d 2350
 
3.6%
s 2071
 
3.2%
Other values (14) 6734
10.3%
Uppercase Letter
ValueCountFrequency (%)
G 4959
33.6%
N 4459
30.2%
I 1683
 
11.4%
B 1407
 
9.5%
S 841
 
5.7%
O 512
 
3.5%
U 199
 
1.3%
K 190
 
1.3%
L 178
 
1.2%
R 151
 
1.0%
Other values (9) 186
 
1.3%
Space Separator
ValueCountFrequency (%)
6925
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 79971
92.0%
Common 6925
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11374
14.2%
a 10388
13.0%
n 8928
11.2%
i 6731
8.4%
u 5716
 
7.1%
w 5086
 
6.4%
G 4959
 
6.2%
N 4459
 
5.6%
l 3060
 
3.8%
o 2768
 
3.5%
Other values (33) 16502
20.6%
Common
ValueCountFrequency (%)
6925
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 86748
99.8%
None 148
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11374
13.1%
a 10388
12.0%
n 8928
10.3%
6925
 
8.0%
i 6731
 
7.8%
u 5716
 
6.6%
w 5086
 
5.9%
G 4959
 
5.7%
N 4459
 
5.1%
l 3060
 
3.5%
Other values (33) 19122
22.0%
None
ValueCountFrequency (%)
á 148
100.0%
Distinct235
Distinct (%)< 0.1%
Missing5014
Missing (%)0.9%
Memory size4.5 MiB

Length

Max length44
Median length13
Mean length11.36707143
Min length4

Characters and Unicode

Total characters6583660
Distinct characters61
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16
Unique (%)< 0.1%

Sample

1st rowPapua New Guinea
2nd rowUnited States
3rd rowTonga
4th rowGrenada
5th rowUnited States
ValueCountFrequency (%)
states 351011
35.2%
united 349162
35.0%
mexico 22872
 
2.3%
ecuador 16235
 
1.6%
brazil 14751
 
1.5%
territory 13632
 
1.4%
peru 12875
 
1.3%
philippines 11392
 
1.1%
honduras 10938
 
1.1%
panama 7692
 
0.8%
Other values (255) 187970
18.8%

Most occurring characters

ValueCountFrequency (%)
t 1097918
16.7%
e 826989
12.6%
a 632853
9.6%
i 556170
8.4%
n 472266
7.2%
419343
 
6.4%
s 408842
 
6.2%
d 407821
 
6.2%
S 361495
 
5.5%
U 349907
 
5.3%
Other values (51) 1050056
15.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5166158
78.5%
Uppercase Letter 990793
 
15.0%
Space Separator 419343
 
6.4%
Other Punctuation 4735
 
0.1%
Open Punctuation 1221
 
< 0.1%
Close Punctuation 1221
 
< 0.1%
Dash Punctuation 189
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1097918
21.3%
e 826989
16.0%
a 632853
12.2%
i 556170
10.8%
n 472266
9.1%
s 408842
 
7.9%
d 407821
 
7.9%
r 137195
 
2.7%
o 125533
 
2.4%
u 92494
 
1.8%
Other values (19) 408077
 
7.9%
Uppercase Letter
ValueCountFrequency (%)
S 361495
36.5%
U 349907
35.3%
P 45641
 
4.6%
M 31265
 
3.2%
T 27087
 
2.7%
B 26397
 
2.7%
C 25077
 
2.5%
E 19880
 
2.0%
H 14756
 
1.5%
G 12513
 
1.3%
Other values (14) 76775
 
7.7%
Other Punctuation
ValueCountFrequency (%)
, 3713
78.4%
. 1022
 
21.6%
Open Punctuation
ValueCountFrequency (%)
[ 676
55.4%
( 545
44.6%
Close Punctuation
ValueCountFrequency (%)
] 676
55.4%
) 545
44.6%
Space Separator
ValueCountFrequency (%)
419343
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 189
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6156951
93.5%
Common 426709
 
6.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1097918
17.8%
e 826989
13.4%
a 632853
10.3%
i 556170
9.0%
n 472266
7.7%
s 408842
 
6.6%
d 407821
 
6.6%
S 361495
 
5.9%
U 349907
 
5.7%
r 137195
 
2.2%
Other values (43) 905495
14.7%
Common
ValueCountFrequency (%)
419343
98.3%
, 3713
 
0.9%
. 1022
 
0.2%
[ 676
 
0.2%
] 676
 
0.2%
( 545
 
0.1%
) 545
 
0.1%
- 189
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6581073
> 99.9%
None 2587
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 1097918
16.7%
e 826989
12.6%
a 632853
9.6%
i 556170
8.5%
n 472266
7.2%
419343
 
6.4%
s 408842
 
6.2%
d 407821
 
6.2%
S 361495
 
5.5%
U 349907
 
5.3%
Other values (48) 1047469
15.9%
None
ValueCountFrequency (%)
é 893
34.5%
í 847
32.7%
ã 847
32.7%

stateProvince
Text

Missing 

Distinct2059
Distinct (%)0.4%
Missing17001
Missing (%)2.9%
Memory size4.5 MiB

Length

Max length69
Median length52
Mean length10.58665021
Min length3

Characters and Unicode

Total characters6004748
Distinct characters72
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique356
Unique (%)0.1%

Sample

1st rowCentral Province
2nd rowNorth Carolina
3rd rowTonga Islands
4th rowSt. George Parish
5th rowVirginia
ValueCountFrequency (%)
virginia 93314
 
11.0%
carolina 61709
 
7.2%
north 57614
 
6.8%
maryland 32649
 
3.8%
province 27443
 
3.2%
pennsylvania 18911
 
2.2%
west 18140
 
2.1%
florida 18100
 
2.1%
island 18015
 
2.1%
tennessee 17444
 
2.0%
Other values (1937) 487863
57.3%

Most occurring characters

ValueCountFrequency (%)
a 826291
13.8%
i 632216
 
10.5%
n 557794
 
9.3%
r 474453
 
7.9%
o 407390
 
6.8%
e 304504
 
5.1%
284002
 
4.7%
l 264922
 
4.4%
s 256173
 
4.3%
t 191100
 
3.2%
Other values (62) 1805903
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4862616
81.0%
Uppercase Letter 830502
 
13.8%
Space Separator 284002
 
4.7%
Dash Punctuation 16262
 
0.3%
Other Punctuation 9979
 
0.2%
Open Punctuation 537
 
< 0.1%
Close Punctuation 532
 
< 0.1%
Math Symbol 318
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 826291
17.0%
i 632216
13.0%
n 557794
11.5%
r 474453
9.8%
o 407390
8.4%
e 304504
 
6.3%
l 264922
 
5.4%
s 256173
 
5.3%
t 191100
 
3.9%
g 142256
 
2.9%
Other values (24) 805517
16.6%
Uppercase Letter
ValueCountFrequency (%)
C 108499
13.1%
V 99187
11.9%
P 90787
10.9%
N 84504
10.2%
M 71209
8.6%
I 44333
 
5.3%
S 44167
 
5.3%
T 42704
 
5.1%
G 35681
 
4.3%
A 34622
 
4.2%
Other values (17) 174809
21.0%
Other Punctuation
ValueCountFrequency (%)
. 9193
92.1%
' 757
 
7.6%
? 19
 
0.2%
/ 6
 
0.1%
, 4
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 298
93.7%
+ 20
 
6.3%
Space Separator
ValueCountFrequency (%)
284002
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16262
100.0%
Open Punctuation
ValueCountFrequency (%)
( 537
100.0%
Close Punctuation
ValueCountFrequency (%)
) 532
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5693118
94.8%
Common 311630
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 826291
14.5%
i 632216
 
11.1%
n 557794
 
9.8%
r 474453
 
8.3%
o 407390
 
7.2%
e 304504
 
5.3%
l 264922
 
4.7%
s 256173
 
4.5%
t 191100
 
3.4%
g 142256
 
2.5%
Other values (51) 1636019
28.7%
Common
ValueCountFrequency (%)
284002
91.1%
- 16262
 
5.2%
. 9193
 
2.9%
' 757
 
0.2%
( 537
 
0.2%
) 532
 
0.2%
= 298
 
0.1%
+ 20
 
< 0.1%
? 19
 
< 0.1%
/ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5984875
99.7%
None 19873
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 826291
13.8%
i 632216
 
10.6%
n 557794
 
9.3%
r 474453
 
7.9%
o 407390
 
6.8%
e 304504
 
5.1%
284002
 
4.7%
l 264922
 
4.4%
s 256173
 
4.3%
t 191100
 
3.2%
Other values (53) 1786030
29.8%
None
ValueCountFrequency (%)
á 4907
24.7%
é 4585
23.1%
ã 3690
18.6%
ó 2908
14.6%
í 2325
11.7%
ô 1036
 
5.2%
ñ 367
 
1.8%
ı 48
 
0.2%
Î 7
 
< 0.1%

county
Text

Missing 

Distinct3056
Distinct (%)0.8%
Missing191557
Missing (%)32.8%
Memory size4.5 MiB

Length

Max length56
Median length43
Mean length9.394395432
Min length3

Characters and Unicode

Total characters3688653
Distinct characters74
Distinct categories10
Distinct scripts2
Distinct blocks4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique504
Unique (%)0.1%

Sample

1st rowKairuku-Hiri District
2nd rowBuncombe - Yancey
3rd rowTongatapu Island Group
4th rowAugusta
5th rowElko
ValueCountFrequency (%)
21119
 
3.8%
island 14180
 
2.6%
swain 12742
 
2.3%
city 8568
 
1.6%
province 8458
 
1.5%
giles 8024
 
1.5%
frederick 7508
 
1.4%
macon 7377
 
1.3%
municipality 7367
 
1.3%
haywood 7297
 
1.3%
Other values (2826) 448585
81.4%

Most occurring characters

ValueCountFrequency (%)
a 361375
 
9.8%
e 318401
 
8.6%
n 281913
 
7.6%
o 250126
 
6.8%
i 237836
 
6.4%
r 221961
 
6.0%
l 181195
 
4.9%
158581
 
4.3%
s 154891
 
4.2%
t 142082
 
3.9%
Other values (64) 1380292
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2956891
80.2%
Uppercase Letter 526246
 
14.3%
Space Separator 158581
 
4.3%
Dash Punctuation 25865
 
0.7%
Close Punctuation 7839
 
0.2%
Open Punctuation 7839
 
0.2%
Other Punctuation 5243
 
0.1%
Math Symbol 83
 
< 0.1%
Decimal Number 64
 
< 0.1%
Modifier Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 361375
12.2%
e 318401
10.8%
n 281913
9.5%
o 250126
 
8.5%
i 237836
 
8.0%
r 221961
 
7.5%
l 181195
 
6.1%
s 154891
 
5.2%
t 142082
 
4.8%
c 111847
 
3.8%
Other values (25) 695264
23.5%
Uppercase Letter
ValueCountFrequency (%)
M 56403
 
10.7%
S 49852
 
9.5%
C 48154
 
9.2%
P 46114
 
8.8%
G 36649
 
7.0%
B 29767
 
5.7%
I 27305
 
5.2%
A 26724
 
5.1%
H 24796
 
4.7%
R 21003
 
4.0%
Other values (15) 159479
30.3%
Other Punctuation
ValueCountFrequency (%)
. 3451
65.8%
' 1471
28.1%
, 294
 
5.6%
? 22
 
0.4%
/ 5
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 25626
99.1%
239
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 32
50.0%
0 32
50.0%
Space Separator
ValueCountFrequency (%)
158581
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7839
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7839
100.0%
Math Symbol
ValueCountFrequency (%)
= 83
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3483137
94.4%
Common 205516
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 361375
 
10.4%
e 318401
 
9.1%
n 281913
 
8.1%
o 250126
 
7.2%
i 237836
 
6.8%
r 221961
 
6.4%
l 181195
 
5.2%
s 154891
 
4.4%
t 142082
 
4.1%
c 111847
 
3.2%
Other values (50) 1221510
35.1%
Common
ValueCountFrequency (%)
158581
77.2%
- 25626
 
12.5%
) 7839
 
3.8%
( 7839
 
3.8%
. 3451
 
1.7%
' 1471
 
0.7%
, 294
 
0.1%
239
 
0.1%
= 83
 
< 0.1%
1 32
 
< 0.1%
Other values (4) 61
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3684587
99.9%
None 3825
 
0.1%
Punctuation 239
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 361375
 
9.8%
e 318401
 
8.6%
n 281913
 
7.7%
o 250126
 
6.8%
i 237836
 
6.5%
r 221961
 
6.0%
l 181195
 
4.9%
158581
 
4.3%
s 154891
 
4.2%
t 142082
 
3.9%
Other values (53) 1376226
37.4%
None
ValueCountFrequency (%)
é 1444
37.8%
í 911
23.8%
á 870
22.7%
ó 315
 
8.2%
ô 96
 
2.5%
ñ 72
 
1.9%
â 51
 
1.3%
ü 38
 
1.0%
è 28
 
0.7%
Punctuation
ValueCountFrequency (%)
239
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%
Distinct56650
Distinct (%)9.7%
Missing2303
Missing (%)0.4%
Memory size4.5 MiB

Length

Max length295
Median length193
Mean length54.40064066
Min length2

Characters and Unicode

Total characters31655624
Distinct characters110
Distinct categories12
Distinct scripts2
Distinct blocks4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique25059
Unique (%)4.3%

Sample

1st rowKairuku, Yule Island
2nd rowPisgah National Forest, near Cane River Gap
3rd rowNo Locality Data
4th rowTongatapu Island, adjacent to Fua'amotu Airport
5th rowGrand Anse Bay, west end of, along road to jetty just east of base of Quarantine Point
ValueCountFrequency (%)
of 456712
 
8.0%
mi 190409
 
3.3%
road 182915
 
3.2%
route 156226
 
2.7%
on 147202
 
2.6%
national 106083
 
1.8%
by 93415
 
1.6%
forest 89661
 
1.6%
junction 81776
 
1.4%
km 68711
 
1.2%
Other values (30771) 4165761
72.6%

Most occurring characters

ValueCountFrequency (%)
5156973
16.3%
a 2379871
 
7.5%
o 2379316
 
7.5%
e 1741456
 
5.5%
n 1661294
 
5.2%
i 1563818
 
4.9%
t 1519591
 
4.8%
r 1285474
 
4.1%
l 960809
 
3.0%
, 845140
 
2.7%
Other values (100) 12161882
38.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19579495
61.9%
Space Separator 5156973
 
16.3%
Uppercase Letter 4046777
 
12.8%
Other Punctuation 1240931
 
3.9%
Decimal Number 1169470
 
3.7%
Open Punctuation 200092
 
0.6%
Close Punctuation 200069
 
0.6%
Dash Punctuation 36149
 
0.1%
Math Symbol 25534
 
0.1%
Format 126
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2379871
12.2%
o 2379316
12.2%
e 1741456
 
8.9%
n 1661294
 
8.5%
i 1563818
 
8.0%
t 1519591
 
7.8%
r 1285474
 
6.6%
l 960809
 
4.9%
u 734887
 
3.8%
s 723098
 
3.7%
Other values (38) 4629881
23.6%
Uppercase Letter
ValueCountFrequency (%)
R 450616
 
11.1%
S 430589
 
10.6%
N 383011
 
9.5%
C 258686
 
6.4%
M 234251
 
5.8%
E 231690
 
5.7%
W 220124
 
5.4%
P 210033
 
5.2%
A 197529
 
4.9%
F 190404
 
4.7%
Other values (18) 1239844
30.6%
Other Punctuation
ValueCountFrequency (%)
, 845140
68.1%
. 367701
29.6%
' 12150
 
1.0%
; 6267
 
0.5%
/ 6172
 
0.5%
" 1760
 
0.1%
: 693
 
0.1%
? 661
 
0.1%
# 351
 
< 0.1%
& 36
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 222198
19.0%
0 175576
15.0%
2 157851
13.5%
5 124586
10.7%
3 116675
10.0%
6 105408
9.0%
4 93349
8.0%
7 69091
 
5.9%
8 55301
 
4.7%
9 49435
 
4.2%
Open Punctuation
ValueCountFrequency (%)
( 200029
> 99.9%
[ 62
 
< 0.1%
1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 24365
95.4%
+ 1165
 
4.6%
< 4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 200007
> 99.9%
] 62
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 36140
> 99.9%
9
 
< 0.1%
Space Separator
ValueCountFrequency (%)
5156973
100.0%
Format
ValueCountFrequency (%)
­ 126
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23626272
74.6%
Common 8029352
 
25.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2379871
 
10.1%
o 2379316
 
10.1%
e 1741456
 
7.4%
n 1661294
 
7.0%
i 1563818
 
6.6%
t 1519591
 
6.4%
r 1285474
 
5.4%
l 960809
 
4.1%
u 734887
 
3.1%
s 723098
 
3.1%
Other values (66) 8676658
36.7%
Common
ValueCountFrequency (%)
5156973
64.2%
, 845140
 
10.5%
. 367701
 
4.6%
1 222198
 
2.8%
( 200029
 
2.5%
) 200007
 
2.5%
0 175576
 
2.2%
2 157851
 
2.0%
5 124586
 
1.6%
3 116675
 
1.5%
Other values (24) 462616
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31626455
99.9%
None 29146
 
0.1%
Latin Ext Additional 13
 
< 0.1%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5156973
16.3%
a 2379871
 
7.5%
o 2379316
 
7.5%
e 1741456
 
5.5%
n 1661294
 
5.3%
i 1563818
 
4.9%
t 1519591
 
4.8%
r 1285474
 
4.1%
l 960809
 
3.0%
, 845140
 
2.7%
Other values (73) 12132713
38.4%
None
ValueCountFrequency (%)
í 24109
82.7%
é 1678
 
5.8%
á 1098
 
3.8%
ñ 788
 
2.7%
â 452
 
1.6%
ó 240
 
0.8%
ú 196
 
0.7%
ô 169
 
0.6%
­ 126
 
0.4%
è 59
 
0.2%
Other values (12) 231
 
0.8%
Latin Ext Additional
ValueCountFrequency (%)
9
69.2%
2
 
15.4%
2
 
15.4%
Punctuation
ValueCountFrequency (%)
9
90.0%
1
 
10.0%

minimumElevationInMeters
Text

Missing 

Distinct: 1393
Distinct (%): 0.6%
Missing: 332173
Missing (%): 56.9%
Memory size: 4.5 MiB

Length

Max length: 6
Median length: 5
Mean length: 5.17383386
Min length: 3

Characters and Unicode

Total characters: 1303951
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 172
Unique (%): 0.1%

Sample

1st row: 1317.0
2nd row: 1326.0
3rd row: 2200.0
4th row: 30.0
5th row: 9.0

Value                Count   Frequency (%)
335.0                5696    2.3%
1067.0               4475    1.8%
200.0                3544    1.4%
1036.0               3021    1.2%
91.0                 2873    1.1%
3.0                  2426    1.0%
280.0                2242    0.9%
6.0                  2185    0.9%
320.0                2140    0.8%
30.0                 2121    0.8%
Other values (1380)  221305  87.8%

Most occurring characters

ValueCountFrequency (%)
0 367971
28.2%
. 252028
19.3%
1 174955
13.4%
2 79232
 
6.1%
3 77185
 
5.9%
5 67603
 
5.2%
4 64449
 
4.9%
6 61727
 
4.7%
9 55836
 
4.3%
7 54079
 
4.1%
Other values (2) 48886
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1051918
80.7%
Other Punctuation 252028
 
19.3%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 367971
35.0%
1 174955
16.6%
2 79232
 
7.5%
3 77185
 
7.3%
5 67603
 
6.4%
4 64449
 
6.1%
6 61727
 
5.9%
9 55836
 
5.3%
7 54079
 
5.1%
8 48881
 
4.6%
Other Punctuation
ValueCountFrequency (%)
. 252028
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1303951
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 367971
28.2%
. 252028
19.3%
1 174955
13.4%
2 79232
 
6.1%
3 77185
 
5.9%
5 67603
 
5.2%
4 64449
 
4.9%
6 61727
 
4.7%
9 55836
 
4.3%
7 54079
 
4.1%
Other values (2) 48886
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1303951
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 367971
28.2%
. 252028
19.3%
1 174955
13.4%
2 79232
 
6.1%
3 77185
 
5.9%
5 67603
 
5.2%
4 64449
 
4.9%
6 61727
 
4.7%
9 55836
 
4.3%
7 54079
 
4.1%
Other values (2) 48886
 
3.7%

maximumElevationInMeters
Text

Missing 

Distinct: 1386
Distinct (%): 0.6%
Missing: 333225
Missing (%): 57.0%
Memory size: 4.5 MiB

Length

Max length: 6
Median length: 5
Mean length: 5.186667251
Min length: 3

Characters and Unicode

Total characters: 1301729
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 173
Unique (%): 0.1%

Sample

1st row: 1317.0
2nd row: 1326.0
3rd row: 2200.0
4th row: 50.0
5th row: 9.0

Value                Count   Frequency (%)
411.0                5198    2.1%
1067.0               4371    1.7%
1036.0               3919    1.6%
200.0                2888    1.2%
1146.0               2811    1.1%
975.0                2590    1.0%
280.0                2519    1.0%
3.0                  2326    0.9%
6.0                  2222    0.9%
1189.0               2174    0.9%
Other values (1373)  219958  87.6%

Most occurring characters

ValueCountFrequency (%)
0 362944
27.9%
. 250976
19.3%
1 183815
14.1%
2 79825
 
6.1%
3 69914
 
5.4%
4 67822
 
5.2%
6 62985
 
4.8%
5 62697
 
4.8%
7 57607
 
4.4%
9 54462
 
4.2%
Other values (2) 48682
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1050748
80.7%
Other Punctuation 250976
 
19.3%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 362944
34.5%
1 183815
17.5%
2 79825
 
7.6%
3 69914
 
6.7%
4 67822
 
6.5%
6 62985
 
6.0%
5 62697
 
6.0%
7 57607
 
5.5%
9 54462
 
5.2%
8 48677
 
4.6%
Other Punctuation
ValueCountFrequency (%)
. 250976
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1301729
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 362944
27.9%
. 250976
19.3%
1 183815
14.1%
2 79825
 
6.1%
3 69914
 
5.4%
4 67822
 
5.2%
6 62985
 
4.8%
5 62697
 
4.8%
7 57607
 
4.4%
9 54462
 
4.2%
Other values (2) 48682
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1301729
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 362944
27.9%
. 250976
19.3%
1 183815
14.1%
2 79825
 
6.1%
3 69914
 
5.4%
4 67822
 
5.2%
6 62985
 
4.8%
5 62697
 
4.8%
7 57607
 
4.4%
9 54462
 
4.2%
Other values (2) 48682
 
3.7%

verbatimElevation
Text

Missing 

Distinct: 2882
Distinct (%): 1.1%
Missing: 331608
Missing (%): 56.8%
Memory size: 4.5 MiB

Length

Max length: 93
Median length: 46
Mean length: 7.093015246
Min length: 3

Characters and Unicode

Total characters: 1791646
Distinct characters: 57
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 530
Unique (%): 0.2%

Sample

1st row: 4320 ft
2nd row: 4351 ft
3rd row: 2200 m
4th row: 30-50 m
5th row: 30 ft

Value                Count   Frequency (%)
ft                   191831  36.8%
m                    59860   11.5%
ca                   13358   2.6%
1100-1350            4058    0.8%
200                  3781    0.7%
10                   3450    0.7%
3400                 2848    0.5%
3500                 2819    0.5%
20                   2706    0.5%
3600                 2513    0.5%
Other values (2009)  234300  44.9%
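
The verbatimElevation samples ("4320 ft", "30-50 m", "30 ft") line up with the minimum and maximum elevation samples in meters (1317.0, 30.0 to 50.0, 9.0). A minimal sketch of one way to derive such values follows. The function name `parse_elevation`, the accepted patterns, and the rounding to whole meters are assumptions inferred from the samples shown here, not the data provider's documented procedure.

```python
import re

FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

# Accepts an optional "ca"/"ca." prefix, a number, an optional "-number" range,
# and a unit of "ft" or "m". Anything else is treated as unparseable.
_ELEVATION = re.compile(
    r"^\s*(?:ca\.?\s*)?(\d+(?:\.\d+)?)\s*(?:-\s*(\d+(?:\.\d+)?))?\s*(ft|m)\s*$",
    re.IGNORECASE,
)


def parse_elevation(text):
    """Return (min_m, max_m) in meters, or None if the text does not match.

    Hypothetical helper: rounding to whole meters is an assumption that
    happens to reproduce the sample values in this report.
    """
    match = _ELEVATION.match(text)
    if match is None:
        return None
    low, high, unit = match.groups()
    factor = FEET_TO_METERS if unit.lower() == "ft" else 1.0
    low_m = float(round(float(low) * factor))
    # A single value means minimum and maximum coincide.
    high_m = float(round(float(high) * factor)) if high else low_m
    return low_m, high_m


for raw in ["4320 ft", "4351 ft", "2200 m", "30-50 m", "30 ft"]:
    print(raw, "->", parse_elevation(raw))
```

Values that do not fit the pattern (free-text remarks, "<" qualifiers, and so on) return `None` rather than a guess, so they can be reviewed separately.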

Most occurring characters

ValueCountFrequency (%)
0 376273
21.0%
268931
15.0%
t 192412
10.7%
f 192004
10.7%
1 99566
 
5.6%
3 96808
 
5.4%
2 90988
 
5.1%
4 83319
 
4.7%
5 76675
 
4.3%
m 59946
 
3.3%
Other values (47) 254724
14.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 994929
55.5%
Lowercase Letter 481690
26.9%
Space Separator 268931
 
15.0%
Dash Punctuation 30052
 
1.7%
Other Punctuation 13757
 
0.8%
Close Punctuation 1006
 
0.1%
Open Punctuation 1006
 
0.1%
Math Symbol 195
 
< 0.1%
Uppercase Letter 80
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 192412
39.9%
f 192004
39.9%
m 59946
 
12.4%
a 14859
 
3.1%
c 13366
 
2.8%
e 3277
 
0.7%
l 1590
 
0.3%
v 1058
 
0.2%
s 835
 
0.2%
o 611
 
0.1%
Other values (15) 1732
 
0.4%
Decimal Number
ValueCountFrequency (%)
0 376273
37.8%
1 99566
 
10.0%
3 96808
 
9.7%
2 90988
 
9.1%
4 83319
 
8.4%
5 76675
 
7.7%
6 59540
 
6.0%
8 45372
 
4.6%
7 38030
 
3.8%
9 28358
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
C 23
28.7%
S 15
18.8%
P 12
15.0%
G 12
15.0%
A 10
12.5%
D 5
 
6.2%
L 2
 
2.5%
M 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 13576
98.7%
, 90
 
0.7%
/ 39
 
0.3%
; 22
 
0.2%
? 22
 
0.2%
' 6
 
< 0.1%
2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 110
56.4%
+ 75
38.5%
= 10
 
5.1%
Space Separator
ValueCountFrequency (%)
268931
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 30052
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1006
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1006
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1309876
73.1%
Latin 481770
 
26.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 192412
39.9%
f 192004
39.9%
m 59946
 
12.4%
a 14859
 
3.1%
c 13366
 
2.8%
e 3277
 
0.7%
l 1590
 
0.3%
v 1058
 
0.2%
s 835
 
0.2%
o 611
 
0.1%
Other values (23) 1812
 
0.4%
Common
ValueCountFrequency (%)
0 376273
28.7%
268931
20.5%
1 99566
 
7.6%
3 96808
 
7.4%
2 90988
 
6.9%
4 83319
 
6.4%
5 76675
 
5.9%
6 59540
 
4.5%
8 45372
 
3.5%
7 38030
 
2.9%
Other values (14) 74374
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1791644
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 376273
21.0%
268931
15.0%
t 192412
10.7%
f 192004
10.7%
1 99566
 
5.6%
3 96808
 
5.4%
2 90988
 
5.1%
4 83319
 
4.7%
5 76675
 
4.3%
m 59946
 
3.3%
Other values (46) 254722
14.2%
Punctuation
ValueCountFrequency (%)
2
100.0%

decimalLatitude
Text

Missing 

Distinct: 23890
Distinct (%): 5.7%
Missing: 162901
Missing (%): 27.9%
Memory size: 4.5 MiB

Length

Max length: 11
Median length: 7
Mean length: 6.648575837
Min length: 3

Characters and Unicode

Total characters: 2801045
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7815
Unique (%): 1.9%

Sample

1st row: -8.8201
2nd row: 35.8083
3rd row: 12.0217
4th row: 38.39
5th row: 40.9582

Value                 Count   Frequency (%)
39.6306               4296    1.0%
13.6389               2247    0.5%
39.8872               1888    0.4%
12.83                 1754    0.4%
26.9844               1718    0.4%
4.0147                1664    0.4%
37.4161               1535    0.4%
36.7631               1511    0.4%
25.4017               1483    0.4%
36.9486               1468    0.3%
Other values (23415)  401736  95.4%

Most occurring characters

ValueCountFrequency (%)
3 469957
16.8%
. 421300
15.0%
1 234920
8.4%
6 225682
8.1%
8 222143
7.9%
4 219369
7.8%
5 217631
7.8%
7 201596
7.2%
2 197526
7.1%
9 188366
6.7%
Other values (3) 202555
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2320649
82.8%
Other Punctuation 421300
 
15.0%
Dash Punctuation 59033
 
2.1%
Uppercase Letter 63
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 469957
20.3%
1 234920
10.1%
6 225682
9.7%
8 222143
9.6%
4 219369
9.5%
5 217631
9.4%
7 201596
8.7%
2 197526
8.5%
9 188366
8.1%
0 143459
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 421300
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 59033
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 63
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2800982
> 99.9%
Latin 63
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 469957
16.8%
. 421300
15.0%
1 234920
8.4%
6 225682
8.1%
8 222143
7.9%
4 219369
7.8%
5 217631
7.8%
7 201596
7.2%
2 197526
7.1%
9 188366
6.7%
Other values (2) 202492
7.2%
Latin
ValueCountFrequency (%)
E 63
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2801045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 469957
16.8%
. 421300
15.0%
1 234920
8.4%
6 225682
8.1%
8 222143
7.9%
4 219369
7.8%
5 217631
7.8%
7 201596
7.2%
2 197526
7.1%
9 188366
6.7%
Other values (3) 202555
7.2%

decimalLongitude
Text

Missing 

Distinct: 24293
Distinct (%): 5.8%
Missing: 162901
Missing (%): 27.9%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.51880845
Min length: 3

Characters and Unicode

Total characters: 3167674
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7784
Unique (%): 1.8%

Sample

1st row: 146.53
2nd row: -82.3481
3rd row: -61.7664
4th row: -79.25
5th row: -115.434

Value                 Count   Frequency (%)
77.4714               4296    1.0%
144.962               2247    0.5%
77.7786               2139    0.5%
87.1889               1888    0.4%
69.28                 1763    0.4%
81.4919               1718    0.4%
80.5097               1653    0.4%
81.2228               1509    0.4%
80.6567               1483    0.4%
79.5561               1463    0.3%
Other values (24157)  401141  95.2%

Most occurring characters

ValueCountFrequency (%)
. 421300
13.3%
- 381965
12.1%
7 372583
11.8%
8 353308
11.2%
1 252479
8.0%
3 236540
7.5%
6 221739
7.0%
9 209505
6.6%
4 196861
6.2%
2 192014
6.1%
Other values (2) 329380
10.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2364409
74.6%
Other Punctuation 421300
 
13.3%
Dash Punctuation 381965
 
12.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 372583
15.8%
8 353308
14.9%
1 252479
10.7%
3 236540
10.0%
6 221739
9.4%
9 209505
8.9%
4 196861
8.3%
2 192014
8.1%
5 191410
8.1%
0 137970
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 421300
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 381965
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3167674
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 421300
13.3%
- 381965
12.1%
7 372583
11.8%
8 353308
11.2%
1 252479
8.0%
3 236540
7.5%
6 221739
7.0%
9 209505
6.6%
4 196861
6.2%
2 192014
6.1%
Other values (2) 329380
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3167674
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 421300
13.3%
- 381965
12.1%
7 372583
11.8%
8 353308
11.2%
1 252479
8.0%
3 236540
7.5%
6 221739
7.0%
9 209505
6.6%
4 196861
6.2%
2 192014
6.1%
Other values (2) 329380
10.4%

geodeticDatum
Text

Missing 

Distinct: 21
Distinct (%): < 0.1%
Missing: 438700
Missing (%): 75.1%
Memory size: 4.5 MiB

Length

Max length: 31
Median length: 5
Mean length: 5.585164363
Min length: 3

Characters and Unicode

Total characters: 812647
Distinct characters: 46
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: WGS84
2nd row: NAD27
3rd row: WGS84
4th row: WGS84
5th row: WGS84

Value              Count  Frequency (%)
wgs84              84632  53.7%
nad27              33007  21.0%
nad83              8733   5.5%
prp_m              8459   5.4%
not                4217   2.7%
recorded           4217   2.7%
agd66              2352   1.5%
japanese           1809   1.1%
geodetic           1809   1.1%
datum              1809   1.1%
Other values (22)  6483   4.1%

Most occurring characters

ValueCountFrequency (%)
8 94697
11.7%
G 90266
11.1%
4 86068
10.6%
S 85622
10.5%
W 85612
10.5%
D 46433
 
5.7%
A 44645
 
5.5%
N 42091
 
5.2%
2 34845
 
4.3%
7 33036
 
4.1%
Other values (36) 169332
20.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 431894
53.1%
Decimal Number 268384
33.0%
Lowercase Letter 91871
 
11.3%
Space Separator 12026
 
1.5%
Connector Punctuation 8459
 
1.0%
Open Punctuation 6
 
< 0.1%
Close Punctuation 6
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 18818
20.5%
d 11435
12.4%
o 10848
11.8%
t 9034
9.8%
r 8415
9.2%
n 6934
 
7.5%
a 6788
 
7.4%
c 6629
 
7.2%
u 3002
 
3.3%
m 2412
 
2.6%
Other values (8) 7556
8.2%
Uppercase Letter
ValueCountFrequency (%)
G 90266
20.9%
S 85622
19.8%
W 85612
19.8%
D 46433
10.8%
A 44645
10.3%
N 42091
9.7%
P 16928
 
3.9%
R 8667
 
2.0%
M 8459
 
2.0%
J 1809
 
0.4%
Other values (3) 1362
 
0.3%
Decimal Number
ValueCountFrequency (%)
8 94697
35.3%
4 86068
32.1%
2 34845
 
13.0%
7 33036
 
12.3%
3 8786
 
3.3%
0 5427
 
2.0%
6 4705
 
1.8%
9 488
 
0.2%
1 170
 
0.1%
5 162
 
0.1%
Space Separator
ValueCountFrequency (%)
12026
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8459
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 523765
64.5%
Common 288882
35.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 90266
17.2%
S 85622
16.3%
W 85612
16.3%
D 46433
8.9%
A 44645
8.5%
N 42091
8.0%
e 18818
 
3.6%
P 16928
 
3.2%
d 11435
 
2.2%
o 10848
 
2.1%
Other values (21) 71067
13.6%
Common
ValueCountFrequency (%)
8 94697
32.8%
4 86068
29.8%
2 34845
 
12.1%
7 33036
 
11.4%
12026
 
4.2%
3 8786
 
3.0%
_ 8459
 
2.9%
0 5427
 
1.9%
6 4705
 
1.6%
9 488
 
0.2%
Other values (5) 345
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 812647
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 94697
11.7%
G 90266
11.1%
4 86068
10.6%
S 85622
10.5%
W 85612
10.5%
D 46433
 
5.7%
A 44645
 
5.5%
N 42091
 
5.2%
2 34845
 
4.3%
7 33036
 
4.1%
Other values (36) 169332
20.8%

Distinct: 7372
Distinct (%): 5.1%
Missing: 439218
Missing (%): 75.2%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 7
Mean length: 5.852920687
Min length: 1

Characters and Unicode

Total characters: 848574
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2163
Unique (%): 1.5%

Sample

1st row: 402.336
2nd row: 96.5606
3rd row: 152901
4th row: 6115
5th row: 1754.18

Value                Count   Frequency (%)
347.618              1384    1.0%
186.684              1338    0.9%
4615                 1110    0.8%
5615                 1066    0.7%
1066                 1030    0.7%
3615                 978     0.7%
5115                 953     0.7%
4115                 946     0.7%
177.028              882     0.6%
402.336              826     0.6%
Other values (7362)  134470  92.7%

Most occurring characters

ValueCountFrequency (%)
1 113723
13.4%
. 89782
10.6%
2 84938
10.0%
5 81478
9.6%
3 79563
9.4%
4 78325
9.2%
6 73067
8.6%
9 65740
7.7%
8 63393
7.5%
7 60182
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 758792
89.4%
Other Punctuation 89782
 
10.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 113723
15.0%
2 84938
11.2%
5 81478
10.7%
3 79563
10.5%
4 78325
10.3%
6 73067
9.6%
9 65740
8.7%
8 63393
8.4%
7 60182
7.9%
0 58383
7.7%
Other Punctuation
ValueCountFrequency (%)
. 89782
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 848574
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 113723
13.4%
. 89782
10.6%
2 84938
10.0%
5 81478
9.6%
3 79563
9.4%
4 78325
9.2%
6 73067
8.6%
9 65740
7.7%
8 63393
7.5%
7 60182
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 848574
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 113723
13.4%
. 89782
10.6%
2 84938
10.0%
5 81478
9.6%
3 79563
9.4%
4 78325
9.2%
6 73067
8.6%
9 65740
7.7%
8 63393
7.5%
7 60182
7.1%

verbatimLatitude
Text

Missing 

Distinct: 8743
Distinct (%): 3.5%
Missing: 334540
Missing (%): 57.3%
Memory size: 4.5 MiB

Length

Max length: 22
Median length: 10
Mean length: 9.947965441
Min length: 1

Characters and Unicode

Total characters: 2483619
Distinct characters: 22
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2180
Unique (%): 0.9%

Sample

1st row: 35 48 30 N
2nd row: 15 38 -- N
3rd row: 18 27 30 N
4th row: 37 27 15 N
5th row: 38 02 59 N

Value                Count   Frequency (%)
n                    226021  23.1%
35                   53191   5.4%
38                   41911   4.3%
39                   37026   3.8%
37                   36639   3.8%
                     31610   3.2%
36                   31118   3.2%
s                    20880   2.1%
40                   16364   1.7%
50                   15334   1.6%
Other values (1884)  466720  47.8%
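
The verbatim coordinate samples are written as space-separated degrees, minutes, seconds and a hemisphere letter, with "--" where a component was not recorded. The sketch below shows one way such strings could be converted to signed decimal degrees. The function name `dms_to_decimal`, the treatment of "--" as zero, and the four-decimal rounding are assumptions for illustration; the report does not state how the decimal coordinates in this dataset were actually produced.

```python
def dms_to_decimal(text, ndigits=4):
    """Convert 'DD MM SS H' (for example '35 48 30 N') to signed decimal degrees.

    Hypothetical helper. A component written only with dashes ('--') is
    treated as 0, which is an assumption. Returns None when the text does
    not have exactly four whitespace-separated parts or cannot be parsed.
    """
    parts = text.split()
    if len(parts) != 4:
        return None
    degrees, minutes, seconds, hemisphere = parts
    hemisphere = hemisphere.upper()
    if hemisphere not in {"N", "S", "E", "W"}:
        return None

    def component(token):
        # '--' marks an unrecorded component in the samples shown.
        return 0.0 if set(token) == {"-"} else float(token)

    try:
        value = component(degrees) + component(minutes) / 60 + component(seconds) / 3600
    except ValueError:
        return None
    # South latitudes and west longitudes are negative.
    if hemisphere in {"S", "W"}:
        value = -value
    return round(value, ndigits)


for raw in ["35 48 30 N", "15 38 -- N", "082 20 53 W"]:
    print(raw, "->", dms_to_decimal(raw))
```

With these assumptions, "35 48 30 N" gives 35.8083 and "082 20 53 W" gives -82.3481, values that also appear among the decimalLatitude and decimalLongitude samples in this report.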

Most occurring characters

ValueCountFrequency (%)
727153
29.3%
3 317112
12.8%
N 226318
 
9.1%
5 189972
 
7.6%
0 159182
 
6.4%
2 158470
 
6.4%
4 154033
 
6.2%
1 141432
 
5.7%
8 85002
 
3.4%
7 81967
 
3.3%
Other values (12) 242978
 
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1433412
57.7%
Space Separator 727153
29.3%
Uppercase Letter 247343
 
10.0%
Dash Punctuation 63417
 
2.6%
Other Punctuation 12231
 
0.5%
Other Symbol 38
 
< 0.1%
Modifier Letter 24
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 317112
22.1%
5 189972
13.3%
0 159182
11.1%
2 158470
11.1%
4 154033
10.7%
1 141432
9.9%
8 85002
 
5.9%
7 81967
 
5.7%
9 73456
 
5.1%
6 72786
 
5.1%
Other Punctuation
ValueCountFrequency (%)
. 11967
97.8%
' 158
 
1.3%
" 51
 
0.4%
? 38
 
0.3%
; 17
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 226318
91.5%
S 21025
 
8.5%
Space Separator
ValueCountFrequency (%)
727153
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 63417
100.0%
Other Symbol
ValueCountFrequency (%)
° 38
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 24
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2236276
90.0%
Latin 247343
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
727153
32.5%
3 317112
14.2%
5 189972
 
8.5%
0 159182
 
7.1%
2 158470
 
7.1%
4 154033
 
6.9%
1 141432
 
6.3%
8 85002
 
3.8%
7 81967
 
3.7%
9 73456
 
3.3%
Other values (10) 148497
 
6.6%
Latin
ValueCountFrequency (%)
N 226318
91.5%
S 21025
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2483556
> 99.9%
None 38
 
< 0.1%
Modifier Letters 24
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
727153
29.3%
3 317112
12.8%
N 226318
 
9.1%
5 189972
 
7.6%
0 159182
 
6.4%
2 158470
 
6.4%
4 154033
 
6.2%
1 141432
 
5.7%
8 85002
 
3.4%
7 81967
 
3.3%
Other values (9) 242915
 
9.8%
None
ValueCountFrequency (%)
° 38
100.0%
Modifier Letters
ValueCountFrequency (%)
ʹ 24
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

verbatimLongitude
Text

Missing 

Distinct: 9294
Distinct (%): 3.7%
Missing: 334562
Missing (%): 57.3%
Memory size: 4.5 MiB

Length

Max length: 24
Median length: 11
Mean length: 10.89844536
Min length: 3

Characters and Unicode

Total characters: 2720677
Distinct characters: 25
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2355
Unique (%): 0.9%

Sample

1st row: 082 20 53 W
2nd row: 088 15 -- W
3rd row: 063 33 13 W
4th row: 077 05 15 W
5th row: 77 41 22 W

Value                Count   Frequency (%)
w                    225797  23.1%
083                  32649   3.3%
                     32039   3.3%
e                    21036   2.2%
077                  20721   2.1%
081                  18686   1.9%
080                  18538   1.9%
076                  17637   1.8%
078                  17121   1.8%
079                  16537   1.7%
Other values (2116)  555848  56.9%

Most occurring characters

ValueCountFrequency (%)
726970
26.7%
0 378660
13.9%
W 225899
 
8.3%
8 184727
 
6.8%
3 182195
 
6.7%
7 168795
 
6.2%
1 163256
 
6.0%
2 149535
 
5.5%
4 149396
 
5.5%
5 148151
 
5.4%
Other values (15) 243093
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1668307
61.3%
Space Separator 726970
26.7%
Uppercase Letter 247315
 
9.1%
Dash Punctuation 65666
 
2.4%
Other Punctuation 12208
 
0.4%
Open Punctuation 73
 
< 0.1%
Close Punctuation 73
 
< 0.1%
Other Symbol 39
 
< 0.1%
Modifier Letter 24
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 378660
22.7%
8 184727
11.1%
3 182195
10.9%
7 168795
10.1%
1 163256
9.8%
2 149535
 
9.0%
4 149396
 
9.0%
5 148151
 
8.9%
9 72593
 
4.4%
6 70999
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 11959
98.0%
' 144
 
1.2%
" 50
 
0.4%
? 38
 
0.3%
; 17
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
W 225899
91.3%
E 21415
 
8.7%
S 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
726970
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65666
100.0%
Open Punctuation
ValueCountFrequency (%)
( 73
100.0%
Close Punctuation
ValueCountFrequency (%)
) 73
100.0%
Other Symbol
ValueCountFrequency (%)
° 39
100.0%
Modifier Letter
ValueCountFrequency (%)
ʹ 24
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2473362
90.9%
Latin 247315
 
9.1%

Most frequent character per script

Common
ValueCountFrequency (%)
726970
29.4%
0 378660
15.3%
8 184727
 
7.5%
3 182195
 
7.4%
7 168795
 
6.8%
1 163256
 
6.6%
2 149535
 
6.0%
4 149396
 
6.0%
5 148151
 
6.0%
9 72593
 
2.9%
Other values (12) 149084
 
6.0%
Latin
ValueCountFrequency (%)
W 225899
91.3%
E 21415
 
8.7%
S 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2720612
> 99.9%
None 39
 
< 0.1%
Modifier Letters 24
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
726970
26.7%
0 378660
13.9%
W 225899
 
8.3%
8 184727
 
6.8%
3 182195
 
6.7%
7 168795
 
6.2%
1 163256
 
6.0%
2 149535
 
5.5%
4 149396
 
5.5%
5 148151
 
5.4%
Other values (12) 243028
 
8.9%
None
ValueCountFrequency (%)
° 39
100.0%
Modifier Letters
ValueCountFrequency (%)
ʹ 24
100.0%
Punctuation
ValueCountFrequency (%)
2
100.0%

georeferenceProtocol
Text

Missing 

Distinct: 3371
Distinct (%): 2.3%
Missing: 439136
Missing (%): 75.2%
Memory size: 4.5 MiB

Length

Max length: 302
Median length: 251
Mean length: 91.26128977
Min length: 3

Characters and Unicode

Total characters: 13238819
Distinct characters: 86
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 891
Unique (%): 0.6%

Sample

1st row: USGS Palo Alto Quad (TopoZone - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
2nd row: Terrain Navigator v. 5.03 USGS 1:24,000, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
3rd row: Alexandria Digital Library Gazetteer, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
4th row: USGS Chesterfield Quad (TopoZine - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
5th row: USGS Falls Church Quad (TopoZone - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines

Value                Count   Frequency (%)
georeferencing       134216  9.7%
manis/herpnet/ornis  134163  9.7%
guidelines           134143  9.7%
usgs                 59079   4.3%
1:24,000             54333   3.9%
                     44136   3.2%
quad                 39827   2.9%
digital              22588   1.6%
gazetteer            22105   1.6%
topozone             21638   1.6%
Other values (3792)  715459  51.8%

Most occurring characters

ValueCountFrequency (%)
e 1320173
 
10.0%
1236622
 
9.3%
r 733799
 
5.5%
i 691510
 
5.2%
a 629206
 
4.8%
n 622138
 
4.7%
o 500801
 
3.8%
N 461182
 
3.5%
S 454207
 
3.4%
G 414644
 
3.1%
Other values (76) 6174537
46.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7136568
53.9%
Uppercase Letter 3060694
23.1%
Space Separator 1236622
 
9.3%
Decimal Number 835786
 
6.3%
Other Punctuation 760937
 
5.7%
Open Punctuation 71491
 
0.5%
Close Punctuation 71272
 
0.5%
Dash Punctuation 65161
 
0.5%
Connector Punctuation 248
 
< 0.1%
Math Symbol 40
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1320173
18.5%
r 733799
10.3%
i 691510
9.7%
a 629206
8.8%
n 622138
8.7%
o 500801
 
7.0%
l 307980
 
4.3%
d 294924
 
4.1%
t 258449
 
3.6%
g 250955
 
3.5%
Other values (19) 1526633
21.4%
Uppercase Letter
ValueCountFrequency (%)
N 461182
15.1%
S 454207
14.8%
G 414644
13.5%
I 303621
9.9%
T 221610
7.2%
M 189899
6.2%
E 166244
 
5.4%
O 161796
 
5.3%
R 151192
 
4.9%
H 140237
 
4.6%
Other values (17) 396062
12.9%
Other Punctuation
ValueCountFrequency (%)
/ 286534
37.7%
, 258402
34.0%
: 100996
 
13.3%
. 80708
 
10.6%
; 15057
 
2.0%
! 9034
 
1.2%
# 6647
 
0.9%
' 2637
 
0.3%
& 813
 
0.1%
? 94
 
< 0.1%
Other values (3) 15
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 379892
45.5%
1 133690
 
16.0%
2 100269
 
12.0%
4 76915
 
9.2%
5 38693
 
4.6%
7 25544
 
3.1%
9 22590
 
2.7%
6 22338
 
2.7%
3 22202
 
2.7%
8 13653
 
1.6%
Math Symbol
ValueCountFrequency (%)
+ 24
60.0%
= 16
40.0%
Space Separator
ValueCountFrequency (%)
1236622
100.0%
Open Punctuation
ValueCountFrequency (%)
( 71491
100.0%
Close Punctuation
ValueCountFrequency (%)
) 71272
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65161
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 248
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10197262
77.0%
Common 3041557
 
23.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1320173
 
12.9%
r 733799
 
7.2%
i 691510
 
6.8%
a 629206
 
6.2%
n 622138
 
6.1%
o 500801
 
4.9%
N 461182
 
4.5%
S 454207
 
4.5%
G 414644
 
4.1%
l 307980
 
3.0%
Other values (46) 4061622
39.8%
Common
ValueCountFrequency (%)
1236622
40.7%
0 379892
 
12.5%
/ 286534
 
9.4%
, 258402
 
8.5%
1 133690
 
4.4%
: 100996
 
3.3%
2 100269
 
3.3%
. 80708
 
2.7%
4 76915
 
2.5%
( 71491
 
2.4%
Other values (20) 316038
 
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13234776
> 99.9%
None 4039
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1320173
 
10.0%
1236622
 
9.3%
r 733799
 
5.5%
i 691510
 
5.2%
a 629206
 
4.8%
n 622138
 
4.7%
o 500801
 
3.8%
N 461182
 
3.5%
S 454207
 
3.4%
G 414644
 
3.1%
Other values (71) 6170494
46.6%
None
ValueCountFrequency (%)
í 4030
99.8%
é 5
 
0.1%
ô 2
 
< 0.1%
Î 2
 
< 0.1%
Punctuation
ValueCountFrequency (%)
4
100.0%

georeferenceRemarks
Text

Missing 

Distinct: 3681
Distinct (%): 2.6%
Missing: 443625
Missing (%): 75.9%
Memory size: 4.5 MiB

Length

Max length: 83
Median length: 55
Mean length: 22.53162702
Min length: 7

Characters and Unicode

Total characters: 3167406
Distinct characters: 64
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1057
Unique (%): 0.8%

Sample

1st row: Locality extent = 0.05
2nd row: Locality extent = 95
3rd row: Locality extent = 3.5
4th row: Datum Guam 63
5th row: Locality extent = 1.08

Value                Count   Frequency (%)
extent               134257  22.0%
                     134207  22.0%
locality             134203  22.0%
mi                   40072   6.6%
km                   8736    1.4%
0.1                  7251    1.2%
datum                6200    1.0%
63                   5497    0.9%
guam                 5494    0.9%
1                    5323    0.9%
Other values (2938)  128798  21.1%

Most occurring characters

ValueCountFrequency (%)
469462
14.8%
t 411232
13.0%
e 269464
 
8.5%
i 175099
 
5.5%
. 149589
 
4.7%
a 146541
 
4.6%
l 134689
 
4.3%
n 134567
 
4.2%
o 134447
 
4.2%
y 134376
 
4.2%
Other values (54) 1007940
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1896549
59.9%
Space Separator 469462
 
14.8%
Decimal Number 368654
 
11.6%
Other Punctuation 149871
 
4.7%
Uppercase Letter 148496
 
4.7%
Math Symbol 134208
 
4.2%
Dash Punctuation 72
 
< 0.1%
Open Punctuation 47
 
< 0.1%
Close Punctuation 47
 
< 0.1%
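
The category table above can be approximated with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own implementation: the two input strings are taken from the Sample rows of georeferenceRemarks, the function name `category_counts` is hypothetical, and the output uses the two-letter Unicode category codes (`Ll`, `Zs`, `Nd`, ...) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter

# Two values copied from the "Sample" rows of georeferenceRemarks.
values = ["Locality extent = 0.05", "Datum Guam 63"]


def category_counts(strings):
    """Count Unicode general categories (e.g. 'Ll', 'Zs') over all characters."""
    counts = Counter()
    for s in strings:
        counts.update(unicodedata.category(ch) for ch in s)
    return counts


counts = category_counts(values)
total = sum(counts.values())

# Print one row per category: code, count, share of all characters.
for cat, n in counts.most_common():
    print(f"{cat}\t{n}\t{100 * n / total:.1f}%")
```

Here `Ll` corresponds to "Lowercase Letter", `Lu` to "Uppercase Letter", `Zs` to "Space Separator", `Nd` to "Decimal Number", `Po` to "Other Punctuation" and `Sm` to "Math Symbol".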

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 411232
21.7%
e 269464
14.2%
i 175099
9.2%
a 146541
 
7.7%
l 134689
 
7.1%
n 134567
 
7.1%
o 134447
 
7.1%
y 134376
 
7.1%
x 134300
 
7.1%
c 134263
 
7.1%
Other values (14) 87571
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
L 134266
90.4%
G 6166
 
4.2%
D 6026
 
4.1%
S 774
 
0.5%
W 687
 
0.5%
H 144
 
0.1%
N 119
 
0.1%
P 107
 
0.1%
E 71
 
< 0.1%
A 37
 
< 0.1%
Other values (9) 99
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 74829
20.3%
1 61579
16.7%
5 51221
13.9%
2 46439
12.6%
3 35147
9.5%
6 23996
 
6.5%
4 21925
 
5.9%
7 21708
 
5.9%
8 19177
 
5.2%
9 12633
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 149589
99.8%
; 174
 
0.1%
, 71
 
< 0.1%
: 19
 
< 0.1%
/ 12
 
< 0.1%
' 6
 
< 0.1%
Space Separator
ValueCountFrequency (%)
469462
100.0%
Math Symbol
ValueCountFrequency (%)
= 134208
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 72
100.0%
Open Punctuation
ValueCountFrequency (%)
( 47
100.0%
Close Punctuation
ValueCountFrequency (%)
) 47
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2045045
64.6%
Common 1122361
35.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 411232
20.1%
e 269464
13.2%
i 175099
8.6%
a 146541
 
7.2%
l 134689
 
6.6%
n 134567
 
6.6%
o 134447
 
6.6%
y 134376
 
6.6%
x 134300
 
6.6%
L 134266
 
6.6%
Other values (33) 236064
11.5%
Common
ValueCountFrequency (%)
469462
41.8%
. 149589
 
13.3%
= 134208
 
12.0%
0 74829
 
6.7%
1 61579
 
5.5%
5 51221
 
4.6%
2 46439
 
4.1%
3 35147
 
3.1%
6 23996
 
2.1%
4 21925
 
2.0%
Other values (11) 53966
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3167406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
469462
14.8%
t 411232
13.0%
e 269464
 
8.5%
i 175099
 
5.5%
. 149589
 
4.7%
a 146541
 
4.6%
l 134689
 
4.3%
n 134567
 
4.2%
o 134447
 
4.2%
y 134376
 
4.2%
Other values (54) 1007940
31.8%
Distinct: 3
Distinct (%): 0.7%
Missing: 583784
Missing (%): 99.9%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 3
Mean length: 3.167865707
Min length: 3

Characters and Unicode

Total characters: 1321
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: aff.
2nd row: cf.
3rd row: cf.
4th row: cf.
5th row: cf.

Value        Count   Frequency (%)
cf           382     91.6%
aff          28      6.7%
uncertain    7       1.7%

Most occurring characters

ValueCountFrequency (%)
f 438
33.2%
. 410
31.0%
c 389
29.4%
a 35
 
2.6%
n 14
 
1.1%
u 7
 
0.5%
e 7
 
0.5%
r 7
 
0.5%
t 7
 
0.5%
i 7
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 911
69.0%
Other Punctuation 410
31.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 438
48.1%
c 389
42.7%
a 35
 
3.8%
n 14
 
1.5%
u 7
 
0.8%
e 7
 
0.8%
r 7
 
0.8%
t 7
 
0.8%
i 7
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 410
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 911
69.0%
Common 410
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 438
48.1%
c 389
42.7%
a 35
 
3.8%
n 14
 
1.5%
u 7
 
0.8%
e 7
 
0.8%
r 7
 
0.8%
t 7
 
0.8%
i 7
 
0.8%
Common
ValueCountFrequency (%)
. 410
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 438
33.2%
. 410
31.0%
c 389
29.4%
a 35
 
2.6%
n 14
 
1.1%
u 7
 
0.5%
e 7
 
0.5%
r 7
 
0.5%
t 7
 
0.5%
i 7
 
0.5%

typeStatus
Text

Missing 

Distinct: 14
Distinct (%): 0.1%
Missing: 570681
Missing (%): 97.7%
Memory size: 4.5 MiB

Length

Max length: 27
Median length: 8
Mean length: 8.390310651
Min length: 7

Characters and Unicode

Total characters: 113437
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: Paratype
2nd row: Paratype
3rd row: Paratype
4th row: Paratype
5th row: Paralectotype

Value            Count   Frequency (%)
paratype         10833   77.9%
holotype         1225    8.8%
syntype          1222    8.8%
paralectotype    502     3.6%
lectotype        104     0.7%
neotype          25      0.2%
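
A word-frequency table like the one above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the profiler's actual tokenizer: the report does not show how it splits text, so lower-casing, whitespace splitting and punctuation stripping are guesses based on the lower-case tokens in the table. The function name `word_counts` and the multi-valued input `"Holotype; Paratype"` are hypothetical.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lower-cased, punctuation-stripped tokens; skip missing values."""
    counts = Counter()
    for value in values:
        if value is None:  # missing cell
            continue
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


# Hypothetical input: three values from the Sample rows, one missing cell,
# and one made-up multi-valued entry separated by "; ".
sample = ["Paratype", "Paratype", "Paralectotype", None, "Holotype; Paratype"]

counts = word_counts(sample)
total = sum(counts.values())
for word, n in counts.most_common():
    print(f"{word}\t{n}\t{100 * n / total:.1f}%")
```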

Most occurring characters

ValueCountFrequency (%)
a 22670
20.0%
y 15133
13.3%
e 14542
12.8%
t 14517
12.8%
p 13911
12.3%
P 11335
10.0%
r 11335
10.0%
o 3081
 
2.7%
l 1727
 
1.5%
H 1225
 
1.1%
Other values (7) 3961
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 98744
87.0%
Uppercase Letter 13911
 
12.3%
Other Punctuation 391
 
0.3%
Space Separator 391
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 22670
23.0%
y 15133
15.3%
e 14542
14.7%
t 14517
14.7%
p 13911
14.1%
r 11335
11.5%
o 3081
 
3.1%
l 1727
 
1.7%
n 1222
 
1.2%
c 606
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
P 11335
81.5%
H 1225
 
8.8%
S 1222
 
8.8%
L 104
 
0.7%
N 25
 
0.2%
Other Punctuation
ValueCountFrequency (%)
; 391
100.0%
Space Separator
ValueCountFrequency (%)
391
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 112655
99.3%
Common 782
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 22670
20.1%
y 15133
13.4%
e 14542
12.9%
t 14517
12.9%
p 13911
12.3%
P 11335
10.1%
r 11335
10.1%
o 3081
 
2.7%
l 1727
 
1.5%
H 1225
 
1.1%
Other values (5) 3179
 
2.8%
Common
ValueCountFrequency (%)
; 391
50.0%
391
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 113437
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 22670
20.0%
y 15133
13.3%
e 14542
12.8%
t 14517
12.8%
p 13911
12.3%
P 11335
10.0%
r 11335
10.0%
o 3081
 
2.7%
l 1727
 
1.5%
H 1225
 
1.1%
Other values (7) 3961
 
3.5%

identifiedBy
Text

Missing 

Distinct: 8
Distinct (%): 10.5%
Missing: 584125
Missing (%): > 99.9%
Memory size: 4.5 MiB

Length

Max length: 122
Median length: 18
Mean length: 25.17105263
Min length: 14

Characters and Unicode

Total characters: 1913
Distinct characters: 49
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 5.3%

Sample

1st row: Gower, David, (BMNH), The Natural History Museum (UNITED KINGDOM)
2nd row: Crombie, Ronald I.
3rd row: Crombie, Ronald I.
4th row: Crombie, Ronald I.
5th row: Crombie, Ronald I.

Value                Count   Frequency (%)
ronald               56      18.7%
crombie              55      18.3%
i                    55      18.3%
natural              11      3.7%
history              11      3.7%
museum               11      3.7%
united               11      3.7%
gower                10      3.3%
david                10      3.3%
bmnh                 10      3.3%
Other values (26)    60      20.0%

Most occurring characters

ValueCountFrequency (%)
224
 
11.7%
o 146
 
7.6%
e 102
 
5.3%
r 99
 
5.2%
, 98
 
5.1%
a 95
 
5.0%
i 87
 
4.5%
I 77
 
4.0%
n 73
 
3.8%
d 73
 
3.8%
Other values (39) 839
43.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1027
53.7%
Uppercase Letter 452
23.6%
Space Separator 224
 
11.7%
Other Punctuation 163
 
8.5%
Close Punctuation 22
 
1.2%
Open Punctuation 22
 
1.2%
Dash Punctuation 3
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 77
17.0%
R 61
13.5%
C 58
12.8%
N 43
9.5%
M 31
6.9%
D 31
6.9%
H 27
 
6.0%
G 24
 
5.3%
T 23
 
5.1%
E 14
 
3.1%
Other values (12) 63
13.9%
Lowercase Letter
ValueCountFrequency (%)
o 146
14.2%
e 102
9.9%
r 99
9.6%
a 95
9.3%
i 87
8.5%
n 73
7.1%
d 73
7.1%
l 69
6.7%
m 68
6.6%
b 56
 
5.5%
Other values (11) 159
15.5%
Other Punctuation
ValueCountFrequency (%)
, 98
60.1%
. 65
39.9%
Space Separator
ValueCountFrequency (%)
224
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1479
77.3%
Common 434
 
22.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 146
 
9.9%
e 102
 
6.9%
r 99
 
6.7%
a 95
 
6.4%
i 87
 
5.9%
I 77
 
5.2%
n 73
 
4.9%
d 73
 
4.9%
l 69
 
4.7%
m 68
 
4.6%
Other values (33) 590
39.9%
Common
ValueCountFrequency (%)
224
51.6%
, 98
22.6%
. 65
 
15.0%
) 22
 
5.1%
( 22
 
5.1%
- 3
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1913
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
224
 
11.7%
o 146
 
7.6%
e 102
 
5.3%
r 99
 
5.2%
, 98
 
5.1%
a 95
 
5.0%
i 87
 
4.5%
I 77
 
4.0%
n 73
 
3.8%
d 73
 
3.8%
Other values (39) 839
43.9%
Distinct: 9530
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 62
Median length: 56
Mean length: 19.84556343
Min length: 4

Characters and Unicode

Total characters: 11593798
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1890
Unique (%): 0.3%

Sample

1st row: Carlia bicarinata
2nd row: Plethodon montanus
3rd row: Enhydris enhydris
4th row: Gehyra mutilata
5th row: Anolis richardii

Value                  Count    Frequency (%)
plethodon              168423   14.0%
cinereus               75774    6.3%
desmognathus           35846    3.0%
anolis                 18352    1.5%
glutinosus             13372    1.1%
lithobates             12991    1.1%
fuscus                 11321    0.9%
montanus               10417    0.9%
eleutherodactylus      9959     0.8%
anaxyrus               9474     0.8%
Other values (7195)    837184   69.6%

Most occurring characters

ValueCountFrequency (%)
e 976778
 
8.4%
o 954528
 
8.2%
s 947788
 
8.2%
a 896948
 
7.7%
i 821046
 
7.1%
n 729826
 
6.3%
t 711935
 
6.1%
l 642687
 
5.5%
u 635392
 
5.5%
r 633897
 
5.5%
Other values (49) 3642973
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10381385
89.5%
Space Separator 618912
 
5.3%
Uppercase Letter 582830
 
5.0%
Other Punctuation 10078
 
0.1%
Dash Punctuation 561
 
< 0.1%
Open Punctuation 16
 
< 0.1%
Close Punctuation 16
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 976778
9.4%
o 954528
 
9.2%
s 947788
 
9.1%
a 896948
 
8.6%
i 821046
 
7.9%
n 729826
 
7.0%
t 711935
 
6.9%
l 642687
 
6.2%
u 635392
 
6.1%
r 633897
 
6.1%
Other values (16) 2430560
23.4%
Uppercase Letter
ValueCountFrequency (%)
P 210261
36.1%
A 59585
 
10.2%
D 48691
 
8.4%
L 39048
 
6.7%
E 33682
 
5.8%
S 33117
 
5.7%
C 32240
 
5.5%
H 26213
 
4.5%
T 17139
 
2.9%
R 13689
 
2.3%
Other values (15) 69165
 
11.9%
Other Punctuation
ValueCountFrequency (%)
" 8496
84.3%
. 1545
 
15.3%
/ 21
 
0.2%
? 16
 
0.2%
Space Separator
ValueCountFrequency (%)
618912
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 561
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10964215
94.6%
Common 629583
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 976778
 
8.9%
o 954528
 
8.7%
s 947788
 
8.6%
a 896948
 
8.2%
i 821046
 
7.5%
n 729826
 
6.7%
t 711935
 
6.5%
l 642687
 
5.9%
u 635392
 
5.8%
r 633897
 
5.8%
Other values (41) 3013390
27.5%
Common
ValueCountFrequency (%)
618912
98.3%
" 8496
 
1.3%
. 1545
 
0.2%
- 561
 
0.1%
/ 21
 
< 0.1%
( 16
 
< 0.1%
? 16
 
< 0.1%
) 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11593798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 976778
 
8.4%
o 954528
 
8.2%
s 947788
 
8.2%
a 896948
 
7.7%
i 821046
 
7.1%
n 729826
 
6.3%
t 711935
 
6.1%
l 642687
 
5.5%
u 635392
 
5.5%
r 633897
 
5.5%
Other values (49) 3642973
31.4%
Distinct: 167
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 86
Median length: 82
Mean length: 66.44007265
Min length: 10

Characters and Unicode

Total characters: 38814224
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Animalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Scincidae, Eugongylinae
2nd row: Animalia, Chordata, Vertebrata, Amphibia, Caudata, Plethodontidae
3rd row: Animalia, Chordata, Vertebrata, Reptilia, Squamata, Ophidia, Homalopsinae
4th row: Animalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Gekkoninae
5th row: Animalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Polychrotinae

Value                 Count    Frequency (%)
animalia              584195   15.7%
vertebrata            584195   15.7%
chordata              584178   15.7%
amphibia              395159   10.6%
caudata               237127   6.4%
plethodontidae        221369   5.9%
reptilia              189036   5.1%
squamata              169309   4.5%
anura                 157511   4.2%
sauria                116154   3.1%
Other values (166)    484544   13.0%

Most occurring characters

ValueCountFrequency (%)
a 6566805
16.9%
i 3313617
 
8.5%
, 3138578
 
8.1%
3138578
 
8.1%
t 3000106
 
7.7%
e 2360956
 
6.1%
r 2244920
 
5.8%
d 1648115
 
4.2%
h 1357195
 
3.5%
n 1355739
 
3.5%
Other values (36) 10689615
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28814291
74.2%
Uppercase Letter 3722777
 
9.6%
Other Punctuation 3138578
 
8.1%
Space Separator 3138578
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6566805
22.8%
i 3313617
11.5%
t 3000106
10.4%
e 2360956
 
8.2%
r 2244920
 
7.8%
d 1648115
 
5.7%
h 1357195
 
4.7%
n 1355739
 
4.7%
o 1350378
 
4.7%
m 1224848
 
4.3%
Other values (14) 4391612
15.2%
Uppercase Letter
ValueCountFrequency (%)
A 1151930
30.9%
C 876519
23.5%
V 590792
15.9%
S 343033
 
9.2%
P 265039
 
7.1%
R 211930
 
5.7%
O 52750
 
1.4%
H 46430
 
1.2%
E 33840
 
0.9%
T 33424
 
0.9%
Other values (10) 117090
 
3.1%
Other Punctuation
ValueCountFrequency (%)
, 3138578
100.0%
Space Separator
ValueCountFrequency (%)
3138578
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32537068
83.8%
Common 6277156
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6566805
20.2%
i 3313617
10.2%
t 3000106
 
9.2%
e 2360956
 
7.3%
r 2244920
 
6.9%
d 1648115
 
5.1%
h 1357195
 
4.2%
n 1355739
 
4.2%
o 1350378
 
4.2%
m 1224848
 
3.8%
Other values (34) 8114389
24.9%
Common
ValueCountFrequency (%)
, 3138578
50.0%
3138578
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38814224
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6566805
16.9%
i 3313617
 
8.5%
, 3138578
 
8.1%
3138578
 
8.1%
t 3000106
 
7.7%
e 2360956
 
6.1%
r 2244920
 
5.8%
d 1648115
 
4.2%
h 1357195
 
3.5%
n 1355739
 
3.5%
Other values (36) 10689615
27.5%

kingdom
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 6
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 4673560
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Animalia
2nd row: Animalia
3rd row: Animalia
4th row: Animalia
5th row: Animalia

Value       Count    Frequency (%)
animalia    584195   100.0%

Most occurring characters

ValueCountFrequency (%)
i 1168390
25.0%
a 1168390
25.0%
A 584195
12.5%
n 584195
12.5%
m 584195
12.5%
l 584195
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4089365
87.5%
Uppercase Letter 584195
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1168390
28.6%
a 1168390
28.6%
n 584195
14.3%
m 584195
14.3%
l 584195
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 584195
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4673560
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1168390
25.0%
a 1168390
25.0%
A 584195
12.5%
n 584195
12.5%
m 584195
12.5%
l 584195
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4673560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1168390
25.0%
a 1168390
25.0%
A 584195
12.5%
n 584195
12.5%
m 584195
12.5%
l 584195
12.5%

phylum
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 23
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 4673424
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Chordata
2nd row: Chordata
3rd row: Chordata
4th row: Chordata
5th row: Chordata

Value       Count    Frequency (%)
chordata    584178   100.0%

Most occurring characters

ValueCountFrequency (%)
a 1168356
25.0%
C 584178
12.5%
h 584178
12.5%
o 584178
12.5%
r 584178
12.5%
d 584178
12.5%
t 584178
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4089246
87.5%
Uppercase Letter 584178
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1168356
28.6%
h 584178
14.3%
o 584178
14.3%
r 584178
14.3%
d 584178
14.3%
t 584178
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 584178
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4673424
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1168356
25.0%
C 584178
12.5%
h 584178
12.5%
o 584178
12.5%
r 584178
12.5%
d 584178
12.5%
t 584178
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4673424
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1168356
25.0%
C 584178
12.5%
h 584178
12.5%
o 584178
12.5%
r 584178
12.5%
d 584178
12.5%
t 584178
12.5%

class
Text

Distinct: 2
Distinct (%): < 0.1%
Missing: 6
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 4673560
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Reptilia
2nd row: Amphibia
3rd row: Reptilia
4th row: Reptilia
5th row: Reptilia

Value       Count    Frequency (%)
amphibia    395159   67.6%
reptilia    189036   32.4%

Most occurring characters

ValueCountFrequency (%)
i 1168390
25.0%
p 584195
12.5%
a 584195
12.5%
A 395159
 
8.5%
m 395159
 
8.5%
h 395159
 
8.5%
b 395159
 
8.5%
R 189036
 
4.0%
e 189036
 
4.0%
t 189036
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4089365
87.5%
Uppercase Letter 584195
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1168390
28.6%
p 584195
14.3%
a 584195
14.3%
m 395159
 
9.7%
h 395159
 
9.7%
b 395159
 
9.7%
e 189036
 
4.6%
t 189036
 
4.6%
l 189036
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
A 395159
67.6%
R 189036
32.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 4673560
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1168390
25.0%
p 584195
12.5%
a 584195
12.5%
A 395159
 
8.5%
m 395159
 
8.5%
h 395159
 
8.5%
b 395159
 
8.5%
R 189036
 
4.0%
e 189036
 
4.0%
t 189036
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4673560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1168390
25.0%
p 584195
12.5%
a 584195
12.5%
A 395159
 
8.5%
m 395159
 
8.5%
h 395159
 
8.5%
b 395159
 
8.5%
R 189036
 
4.0%
e 189036
 
4.0%
t 189036
 
4.0%

order
Text

Distinct: 7
Distinct (%): < 0.1%
Missing: 6
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 15
Median length: 11
Mean length: 6.855565351
Min length: 5

Characters and Unicode

Total characters: 4004987
Distinct characters: 23
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Squamata
2nd row: Caudata
3rd row: Squamata
4th row: Squamata
5th row: Squamata

Value              Count    Frequency (%)
caudata            237127   40.6%
squamata           169309   29.0%
anura              157511   27.0%
testudines         18909    3.2%
crocodilia         804      0.1%
gymnophiona        521      0.1%
rhynchocephalia    14       < 0.1%
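
The length statistics for `order` can be cross-checked against the frequency table with simple arithmetic. The sketch below assumes that each `order` value is a single word, so the word counts equal the value counts, and that "Mean length" is total characters divided by the number of non-missing rows. Both assumptions are inferred from the figures in this report, not stated by it. All numbers are copied from the report; the variable names are hypothetical.

```python
# Counts copied from the order frequency table (values are lower-cased there,
# which does not change their length).
order_counts = {
    "caudata": 237127,
    "squamata": 169309,
    "anura": 157511,
    "testudines": 18909,
    "crocodilia": 804,
    "gymnophiona": 521,
    "rhynchocephalia": 14,
}

n_observations = 584201  # "Number of observations" from the report overview
n_missing = 6            # "Missing" for the order variable
non_missing = n_observations - n_missing

# Total characters = sum over values of (value length * occurrences).
total_chars = sum(len(name) * count for name, count in order_counts.items())
mean_length = total_chars / non_missing

print(non_missing, total_chars, round(mean_length, 9))
```

Under these assumptions the recomputed total agrees with the reported "Total characters" of 4004987, and the mean agrees with the reported "Mean length" to the precision shown.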

Most occurring characters

ValueCountFrequency (%)
a 1378172
34.4%
u 582856
14.6%
t 425345
 
10.6%
d 256840
 
6.4%
C 237931
 
5.9%
n 177476
 
4.4%
m 169830
 
4.2%
S 169309
 
4.2%
q 169309
 
4.2%
r 158315
 
4.0%
Other values (13) 279604
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3420792
85.4%
Uppercase Letter 584195
 
14.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1378172
40.3%
u 582856
17.0%
t 425345
 
12.4%
d 256840
 
7.5%
n 177476
 
5.2%
m 169830
 
5.0%
q 169309
 
4.9%
r 158315
 
4.6%
e 37832
 
1.1%
s 37818
 
1.1%
Other values (7) 26999
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
C 237931
40.7%
S 169309
29.0%
A 157511
27.0%
T 18909
 
3.2%
G 521
 
0.1%
R 14
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4004987
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1378172
34.4%
u 582856
14.6%
t 425345
 
10.6%
d 256840
 
6.4%
C 237931
 
5.9%
n 177476
 
4.4%
m 169830
 
4.2%
S 169309
 
4.2%
q 169309
 
4.2%
r 158315
 
4.0%
Other values (13) 279604
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4004987
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1378172
34.4%
u 582856
14.6%
t 425345
 
10.6%
d 256840
 
6.4%
C 237931
 
5.9%
n 177476
 
4.4%
m 169830
 
4.2%
S 169309
 
4.2%
q 169309
 
4.2%
r 158315
 
4.0%
Other values (13) 279604
 
7.0%

family
Text

Distinct: 146
Distinct (%): < 0.1%
Missing: 183
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 20
Median length: 19
Mean length: 12.11108562
Min length: 6

Characters and Unicode

Total characters: 7073092
Distinct characters: 42
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Scincidae
2nd row: Plethodontidae
3rd row: Homalopsinae
4th row: Gekkoninae
5th row: Polychrotinae

Value                 Count    Frequency (%)
plethodontidae        221369   37.9%
hylinae               41496    7.1%
scincidae             26137    4.5%
bufonidae             25125    4.3%
ranidae               20319    3.5%
polychrotinae         18552    3.2%
gekkoninae            17268    3.0%
phrynosomatinae       16259    2.8%
colubrinae            15640    2.7%
natricinae            12705    2.2%
Other values (136)    169148   29.0%

Most occurring characters

ValueCountFrequency (%)
e 930659
13.2%
a 759819
10.7%
d 735567
10.4%
o 715098
10.1%
i 680312
9.6%
t 609836
8.6%
n 542175
7.7%
l 383862
 
5.4%
h 316462
 
4.5%
P 264677
 
3.7%
Other values (32) 1134625
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6489074
91.7%
Uppercase Letter 584018
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 930659
14.3%
a 759819
11.7%
d 735567
11.3%
o 715098
11.0%
i 680312
10.5%
t 609836
9.4%
n 542175
8.4%
l 383862
5.9%
h 316462
 
4.9%
r 170638
 
2.6%
Other values (12) 644646
9.9%
Uppercase Letter
ValueCountFrequency (%)
P 264677
45.3%
S 49954
 
8.6%
H 46430
 
8.0%
C 31134
 
5.3%
B 27124
 
4.6%
R 22836
 
3.9%
G 21056
 
3.6%
E 19510
 
3.3%
L 16895
 
2.9%
D 14735
 
2.5%
Other values (10) 69667
 
11.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 7073092
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 930659
13.2%
a 759819
10.7%
d 735567
10.4%
o 715098
10.1%
i 680312
9.6%
t 609836
8.6%
n 542175
7.7%
l 383862
 
5.4%
h 316462
 
4.5%
P 264677
 
3.7%
Other values (32) 1134625
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7073092
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 930659
13.2%
a 759819
10.7%
d 735567
10.4%
o 715098
10.1%
i 680312
9.6%
t 609836
8.6%
n 542175
7.7%
l 383862
 
5.4%
h 316462
 
4.5%
P 264677
 
3.7%
Other values (32) 1134625
16.0%

genus
Text

Distinct: 1387
Distinct (%): 0.2%
Missing: 2
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 18
Median length: 16
Mean length: 9.509797175
Min length: 3

Characters and Unicode

Total characters: 5555614
Distinct characters: 51
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 139
Unique (%): < 0.1%

Sample

1st row: Carlia
2nd row: Plethodon
3rd row: Enhydris
4th row: Gehyra
5th row: Anolis

Value                  Count    Frequency (%)
plethodon              168423   28.8%
desmognathus           35844    6.1%
anolis                 18333    3.1%
lithobates             12993    2.2%
eleutherodactylus      9947     1.7%
anaxyrus               9476     1.6%
sceloporus             8824     1.5%
emoia                  8211     1.4%
eurycea                7626     1.3%
pseudacris             6800     1.2%
Other values (1376)    297722   51.0%

Most occurring characters

ValueCountFrequency (%)
o 674870
12.1%
e 454080
 
8.2%
t 412656
 
7.4%
s 399702
 
7.2%
l 372025
 
6.7%
a 366974
 
6.6%
h 357871
 
6.4%
n 346801
 
6.2%
d 269766
 
4.9%
i 237466
 
4.3%
Other values (41) 1663403
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4972908
89.5%
Uppercase Letter 582706
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 674870
13.6%
e 454080
9.1%
t 412656
 
8.3%
s 399702
 
8.0%
l 372025
 
7.5%
a 366974
 
7.4%
h 357871
 
7.2%
n 346801
 
7.0%
d 269766
 
5.4%
i 237466
 
4.8%
Other values (16) 1080697
21.7%
Uppercase Letter
ValueCountFrequency (%)
P 210245
36.1%
A 59587
 
10.2%
D 48689
 
8.4%
L 39050
 
6.7%
E 33523
 
5.8%
S 33115
 
5.7%
C 32240
 
5.5%
H 26213
 
4.5%
T 17139
 
2.9%
R 13671
 
2.3%
Other values (15) 69234
 
11.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 5555614
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 674870
12.1%
e 454080
 
8.2%
t 412656
 
7.4%
s 399702
 
7.2%
l 372025
 
6.7%
a 366974
 
6.6%
h 357871
 
6.4%
n 346801
 
6.2%
d 269766
 
4.9%
i 237466
 
4.3%
Other values (41) 1663403
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5555614
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 674870
12.1%
e 454080
 
8.2%
t 412656
 
7.4%
s 399702
 
7.2%
l 372025
 
6.7%
a 366974
 
6.6%
h 357871
 
6.4%
n 346801
 
6.2%
d 269766
 
4.9%
i 237466
 
4.3%
Other values (41) 1663403
29.9%

specificEpithet
Text

Missing 

Distinct: 5168
Distinct (%): 0.9%
Missing: 13122
Missing (%): 2.2%
Memory size: 4.5 MiB

Length

Max length: 24
Median length: 22
Mean length: 8.884525609
Min length: 3

Characters and Unicode

Total characters: 5073766
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 760
Unique (%): 0.1%

Sample

1st row: bicarinata
2nd row: montanus
3rd row: enhydris
4th row: mutilata
5th row: richardii

Value                  Count    Frequency (%)
cinereus               75774    13.2%
glutinosus             13098    2.3%
fuscus                 10921    1.9%
montanus               10396    1.8%
jordani                7140     1.2%
metcalfi               6940     1.2%
cylindraceus           6103     1.1%
carolinensis           5850     1.0%
teyahalee              5559     1.0%
septentrionalis        4873     0.8%
Other values (5117)    427892   74.5%

Most occurring characters

ValueCountFrequency (%)
i 546156
10.8%
s 515888
10.2%
e 491179
9.7%
a 489969
9.7%
r 404981
 
8.0%
u 401041
 
7.9%
n 359623
 
7.1%
c 308944
 
6.1%
t 280546
 
5.5%
o 262530
 
5.2%
Other values (20) 1012909
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5063007
99.8%
Other Punctuation 6731
 
0.1%
Space Separator 3467
 
0.1%
Dash Punctuation 561
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 546156
10.8%
s 515888
10.2%
e 491179
9.7%
a 489969
9.7%
r 404981
8.0%
u 401041
7.9%
n 359623
 
7.1%
c 308944
 
6.1%
t 280546
 
5.5%
o 262530
 
5.2%
Other values (16) 1002150
19.8%
Other Punctuation
ValueCountFrequency (%)
" 6710
99.7%
/ 21
 
0.3%
Space Separator
ValueCountFrequency (%)
3467
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5063007
99.8%
Common 10759
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 546156
10.8%
s 515888
10.2%
e 491179
9.7%
a 489969
9.7%
r 404981
8.0%
u 401041
7.9%
n 359623
 
7.1%
c 308944
 
6.1%
t 280546
 
5.5%
o 262530
 
5.2%
Other values (16) 1002150
19.8%
Common
ValueCountFrequency (%)
" 6710
62.4%
3467
32.2%
- 561
 
5.2%
/ 21
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5073766
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 546156
10.8%
s 515888
10.2%
e 491179
9.7%
a 489969
9.7%
r 404981
 
8.0%
u 401041
 
7.9%
n 359623
 
7.1%
c 308944
 
6.1%
t 280546
 
5.5%
o 262530
 
5.2%
Other values (20) 1012909
20.0%

infraspecificEpithet
Text

Missing 

Distinct: 1460
Distinct (%): 5.2%
Missing: 556206
Missing (%): 95.2%
Memory size: 4.5 MiB

Length

Max length: 30
Median length: 22
Mean length: 9.076299339
Min length: 3

Characters and Unicode

Total characters: 254091
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 314
Unique (%): 1.1%

Sample

1st row: occidentalis
2nd row: curta
3rd row: consobrinus
4th row: trinidadensis
5th row: ignigularis

Value                  Count   Frequency (%)
viridescens            1460    5.2%
blanchardi             1211    4.3%
fasciata               1043    3.7%
elegans                911     3.2%
undulatus              640     2.3%
ordinatus              395     1.4%
stejnegeri             390     1.4%
louisianensis          365     1.3%
dorsalis               343     1.2%
fuscus                 318     1.1%
Other values (1442)    21119   74.9%

Most occurring characters

ValueCountFrequency (%)
i 29397
11.6%
a 29037
11.4%
s 26503
10.4%
e 19772
 
7.8%
n 17844
 
7.0%
r 16955
 
6.7%
u 15678
 
6.2%
l 14637
 
5.8%
t 13777
 
5.4%
o 13517
 
5.3%
Other values (17) 56974
22.4%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  253891  99.9%
Space Separator   200     0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
i                  29397  11.6%
a                  29037  11.4%
s                  26503  10.4%
e                  19772  7.8%
n                  17844  7.0%
r                  16955  6.7%
u                  15678  6.2%
l                  14637  5.8%
t                  13777  5.4%
o                  13517  5.3%
Other values (16)  56774  22.4%
Space Separator
Value    Count  Frequency (%)
(space)  200    100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   253891  99.9%
Common  200     0.1%

Most frequent character per script

Latin
Value              Count  Frequency (%)
i                  29397  11.6%
a                  29037  11.4%
s                  26503  10.4%
e                  19772  7.8%
n                  17844  7.0%
r                  16955  6.7%
u                  15678  6.2%
l                  14637  5.8%
t                  13777  5.4%
o                  13517  5.3%
Other values (16)  56774  22.4%
Common
Value    Count  Frequency (%)
(space)  200    100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  254091  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
i                  29397  11.6%
a                  29037  11.4%
s                  26503  10.4%
e                  19772  7.8%
n                  17844  7.0%
r                  16955  6.7%
u                  15678  6.2%
l                  14637  5.8%
t                  13777  5.4%
o                  13517  5.3%
Other values (17)  56974  22.4%

taxonRank
Text

Constant  Missing 

Distinct 1
Distinct (%) < 0.1%
Missing 556206
Missing (%) 95.2%
Memory size 4.5 MiB
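
The missing-value figures can be cross-checked against the report overview with a few lines of arithmetic. This is only a sanity-check sketch; the variable names are illustrative, and the observation count (584201) is taken from the "Number of observations" line of the overview.

```python
n_observations = 584201  # "Number of observations" from the report overview
n_missing = 556206       # "Missing" for taxonRank

# Rows that actually carry a taxonRank value
n_present = n_observations - n_missing

# Share of missing rows, as a percentage of all observations
missing_pct = 100 * n_missing / n_observations

print(n_present)
print(f"{missing_pct:.1f}%")
```

The number of present rows obtained this way should match the count shown for the single value "subspecies" in the value table.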

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 279950
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row subspecies
2nd row subspecies
3rd row subspecies
4th row subspecies
5th row subspecies
Value       Count  Frequency (%)
subspecies  27995  100.0%

Most occurring characters

Value  Count  Frequency (%)
s      83985  30.0%
e      55990  20.0%
u      27995  10.0%
b      27995  10.0%
p      27995  10.0%
c      27995  10.0%
i      27995  10.0%
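
Because taxonRank holds a single constant value, the character counts above follow directly from the letters of "subspecies" multiplied by the number of non-missing rows. The sketch below reproduces that arithmetic; the variable names are illustrative and the row count (27995) is the one shown in the value table.

```python
from collections import Counter

value = "subspecies"  # the only value of taxonRank in this report
n_rows = 27995        # non-missing rows, from the value table

# Letter counts within one value, scaled by the number of rows
per_value = Counter(value)
totals = {ch: count * n_rows for ch, count in per_value.items()}

# Total characters and each character's share of that total
total_chars = len(value) * n_rows
shares = {ch: 100 * count / total_chars for ch, count in totals.items()}

for ch, count in sorted(totals.items(), key=lambda item: -item[1]):
    print(ch, count, f"{shares[ch]:.1f}%")
```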

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  279950  100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
s      83985  30.0%
e      55990  20.0%
u      27995  10.0%
b      27995  10.0%
p      27995  10.0%
c      27995  10.0%
i      27995  10.0%

Most occurring scripts

Value  Count   Frequency (%)
Latin  279950  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
s      83985  30.0%
e      55990  20.0%
u      27995  10.0%
b      27995  10.0%
p      27995  10.0%
c      27995  10.0%
i      27995  10.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  279950  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
s      83985  30.0%
e      55990  20.0%
u      27995  10.0%
b      27995  10.0%
p      27995  10.0%
c      27995  10.0%
i      27995  10.0%